Description

[0001] The present invention relates to novel HIV protein constructs, to their use in medicine, to pharmaceutical 
compositions containing them and to methods of their manufacture. 
[0002] In particular, the invention relates to fusion proteins comprising HIV-1 Tat and/or Nef proteins.

[0003] HIV-1 is the primary cause of the acquired immune deficiency syndrome (AIDS), which is regarded as one of
the world's major health problems. Although extensive research has been conducted throughout the world to produce
a vaccine, such efforts have thus far not been successful.

[0004] Non-envelope proteins of HIV-1 have been described and include, for example, internal structural proteins such
as the products of the gag and pol genes and other non-structural proteins such as Rev, Nef, Vif and Tat (Greene et
al., New England J. Med, 324, 5, 308 et seq (1991) and Bryant et al. (Ed. Pizzo), Pediatr. Infect. Dis. J., 11, 5, 390 et
seq (1992)).

[0005] HIV Nef and Tat proteins are early proteins, that is, they are expressed early in infection and in the absence 
of structural proteins. 

[0006] According to the present invention there is provided a protein comprising

(a) an entire HIV Tat protein or Tat with a C-terminal histidine tail, or a mutated Tat which has undergone deletion,
addition or substitution of one amino acid, or a mutated Tat as defined by SEQ ID NO. 23, linked to either (i) a
protein or lipoprotein fusion partner or (ii) an entire HIV Nef protein or Nef with a C-terminal histidine tail, or Nef
which has undergone deletion, addition or substitution of one amino acid; or

(b) an entire HIV Nef protein or Nef with a C-terminal histidine tail, or Nef which has undergone deletion, addition
or substitution of one amino acid, linked to either (i) a protein or lipoprotein fusion partner or (ii) an entire HIV Tat
protein or Tat with a C-terminal histidine tail, or a mutated Tat which has undergone deletion, addition or substitution
of one amino acid, or a mutated Tat as defined by SEQ ID NO. 23; or

(c) an entire HIV Nef protein or Nef with a C-terminal histidine tail, or Nef which has undergone deletion, addition
or substitution of one amino acid, linked to an entire HIV Tat protein or Tat with a C-terminal histidine tail, or a
mutated Tat which has undergone deletion, addition or substitution of one amino acid, or a mutated Tat as defined
by SEQ ID NO. 23, and a protein or lipoprotein fusion partner.

By 'fusion partner' is meant any protein sequence that is not Tat or Nef. Preferably the fusion partner is protein D or its
lipidated derivative Lipoprotein D, from Haemophilus influenzae B. In particular, it is preferred that the N-terminal third,
i.e. approximately the first 100-130 amino acids, is utilised. This is represented herein as Lipo D 1/3. In a preferred
embodiment of the invention the Nef protein or derivative thereof may be linked to the Tat protein or derivative thereof.
Such Nef-Tat fusions may optionally also be linked to a protein or lipoprotein fusion partner, such as protein D.

[0007] The fusion partner is normally linked to the N-terminus of the Nef or Tat protein.

[0008] Derivatives encompassed within the present invention include molecules with a C-terminal Histidine tail which
preferably comprises between 5 and 10 Histidine residues. Generally, a histidine tail containing n residues is represented
herein as His (n). The presence of a histidine (or 'His') tail aids purification. More specifically, the invention provides
proteins with the following structure:

Lipo D 1/3    Nef        His (6)
Lipo D 1/3    Nef-Tat    His (6)
Prot D 1/3    Nef        His (6)
Prot D 1/3    Nef-Tat    His (6)
              Nef-Tat    His (6)


Figure 1 provides the amino-acid (Seq. ID. No. 7) and DNA sequence (Seq. ID. No. 6) of the fusion partner for such 
constructs. 
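
The His (n) notation can be illustrated with a short Python sketch that counts the run of histidine residues at the C terminus of a one-letter amino acid sequence and checks whether it lies in the preferred range of 5 to 10. The function names and the toy sequences are assumptions made for illustration only; they are not the sequences of Figure 1 or Figure 2.

```python
def his_tail_length(sequence: str) -> int:
    """Return the number of consecutive histidine (H) residues at the C terminus."""
    seq = sequence.strip().upper()
    return len(seq) - len(seq.rstrip("H"))


def has_preferred_his_tail(sequence: str, min_n: int = 5, max_n: int = 10) -> bool:
    """True when the C-terminal His tail holds between min_n and max_n residues."""
    return min_n <= his_tail_length(sequence) <= max_n


# Toy sequences (hypothetical, for illustration only).
toy_his6 = "MGGKWSKTSG" + "H" * 6   # body ending in a glycine, then His (6)
toy_his3 = "MGGKWSKTSG" + "H" * 3   # tail too short for the preferred range

print(his_tail_length(toy_his6), has_preferred_his_tail(toy_his6))
print(his_tail_length(toy_his3), has_preferred_his_tail(toy_his3))
```

Counting the run from the end, rather than searching for histidines anywhere in the sequence, keeps internal histidine residues from being mistaken for part of the tail.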

[0009] In a preferred embodiment the proteins are expressed with a Histidine tail comprising between 5 and 10, and
preferably six, Histidine residues. These are advantageous in aiding purification. Separate expression, in yeast
(Saccharomyces cerevisiae), of Nef (Macreadie I.G. et al., 1993, Yeast 9 (6) 565-573) and Tat (Braddock M et al., 1989, Cell
58 (2) 269-79) has already been reported. Only the Nef protein is myristylated. The present invention provides for the first time
the expression of Nef and Tat separately in a Pichia expression system (Nef-His and Tat-His constructs), and the
successful expression of a fusion construct Nef-Tat-His. The DNA and amino acid sequences of representative Nef-His
(Seq. ID. No.s 8 and 9), Tat-His (Seq. ID. No.s 10 and 11) and Nef-Tat-His fusion proteins (Seq. ID. No.s 12 and 13)
are set forth in Figure 2.

[0010] Derivatives encompassed within the present invention also include mutated proteins. The term 'mutated' is
used herein to mean a molecule which has undergone deletion, addition or substitution of one or more amino acids
using well known techniques for site directed mutagenesis or any other conventional method.

[0011] A mutated Tat is illustrated in Figure 2 (Seq. ID. No.s 22 and 23) as is a Nef-Tat Mutant-His (Seq. ID. No.s 24
and 25).

[0012] The present invention also provides a DNA encoding the proteins of the present invention. Such sequences
can be inserted into a suitable expression vector and expressed in a suitable host. 

[0013] A DNA sequence encoding the proteins of the present invention can be synthesized using standard DNA
synthesis techniques, such as by enzymatic ligation as described by D.M. Roberts et al. in Biochemistry 1985, 24,
5090-5098, by chemical synthesis, by in vitro enzymatic polymerization, or by PCR technology utilising for example a
heat stable polymerase, or by a combination of these techniques.

[0014] Enzymatic polymerisation of DNA may be carried out in vitro using a DNA polymerase such as DNA polymerase
I (Klenow fragment) in an appropriate buffer containing the nucleoside triphosphates dATP, dCTP, dGTP and dTTP as
required at a temperature of 10°-37°C, generally in a volume of 50 µl or less. Enzymatic ligation of DNA fragments may
be carried out using a DNA ligase such as T4 DNA ligase in an appropriate buffer, such as 0.05M Tris (pH 7.4), 0.01M
MgCl2, 0.01M dithiothreitol, 1mM spermidine, 1mM ATP and 0.1mg/ml bovine serum albumin, at a temperature of 4°C
to ambient, generally in a volume of 50ml or less. The chemical synthesis of the DNA polymer or fragments may be
carried out by conventional phosphotriester, phosphite or phosphoramidite chemistry, using solid phase techniques such
as those described in 'Chemical and Enzymatic Synthesis of Gene Fragments - A Laboratory Manual' (ed. H.G. Gassen
and A. Lang), Verlag Chemie, Weinheim (1982), or in other scientific publications, for example M.J. Gait, H.W.D. Matthes,
M. Singh, B.S. Sproat, and R.C. Titmas, Nucleic Acids Research, 1982, 10, 6243; B.S. Sproat, and W. Bannwarth,
Tetrahedron Letters, 1983, 24, 5771; M.D. Matteucci and M.H. Caruthers, Tetrahedron Letters, 1980, 21, 719; M.D.
Matteucci and M.H. Caruthers, Journal of the American Chemical Society, 1981, 103, 3185; S.P. Adams et al., Journal
of the American Chemical Society, 1983, 105, 661; N.D. Sinha, J. Biernat, J. McMannus, and H. Koester, Nucleic Acids
Research, 1984, 12, 4539; and H.W.D. Matthes et al., EMBO Journal, 1984, 3, 801.

[0015] The invention also provides a process for preparing a protein of the invention, the process comprising the steps
of:

i) preparing a replicable or integrating expression vector capable, in a host cell, of expressing a DNA polymer
comprising a nucleotide sequence that encodes the protein or a derivative thereof;

ii) transforming a host cell with said vector;

iii) culturing said transformed host cell under conditions permitting expression of said DNA polymer to produce said
protein; and

iv) recovering said protein.

[0016] The process of the invention may be performed by conventional recombinant techniques such as described in
Maniatis et al., Molecular Cloning - A Laboratory Manual; Cold Spring Harbor, 1982-1989.

[0017] The term 'transforming' is used herein to mean the introduction of foreign DNA into a host cell. This can be
achieved for example by transformation, transfection or infection with an appropriate plasmid or viral vector using e.g.
conventional techniques as described in Genetic Engineering; Eds. S.M. Kingsman and A.J. Kingsman; Blackwell
Scientific Publications; Oxford, England, 1988. The term 'transformed' or 'transformant' will hereafter apply to the resulting
host cell containing and expressing the foreign gene of interest.
[0018] The expression vectors are novel and also form part of the invention. 

[0019] The replicable expression vectors may be prepared in accordance with the invention, by cleaving a vector
compatible with the host cell to provide a linear DNA segment having an intact replicon, and combining said linear
segment with one or more DNA molecules which, together with said linear segment, encode the desired product, such
as the DNA polymer encoding the protein of the invention, or derivative thereof, under ligating conditions.
[0020] Thus, the DNA polymer may be preformed or formed during the construction of the vector, as desired. 
[0021] The choice of vector will be determined in part by the host cell, which may be prokaryotic or eukaryotic but
preferably is E. coli or yeast. Suitable vectors include plasmids, bacteriophages, cosmids and recombinant viruses.

[0022] The preparation of the replicable expression vector may be carried out conventionally with appropriate enzymes
for restriction, polymerisation and ligation of the DNA, by procedures described in, for example, Maniatis et al. cited above.
[0023] The recombinant host cell is prepared, in accordance with the invention, by transforming a host cell with a
replicable expression vector of the invention under transforming conditions. Suitable transforming conditions are
conventional and are described in, for example, Maniatis et al. cited above, or "DNA Cloning" Vol. II, D.M. Glover ed., IRL
Press Ltd, 1985.

[0024] The choice of transforming conditions is determined by the host cell. Thus, a bacterial host such as E. coli may
be treated with a solution of CaCl2 (Cohen et al., Proc. Nat. Acad. Sci., 1973, 69, 2110) or with a solution comprising a
mixture of RbCl, MnCl2, potassium acetate and glycerol, and then with 3-[N-morpholino]-propane-sulphonic acid, RbCl
and glycerol. Mammalian cells in culture may be transformed by calcium co-precipitation of the vector DNA onto the
cells. The invention also extends to a host cell transformed with a replicable expression vector of the invention.
[0025] Culturing the transformed host cell under conditions permitting expression of the DNA polymer is carried out
conventionally, as described in, for example, Maniatis et al. and "DNA Cloning" cited above. Thus, preferably the cell is
supplied with nutrient and cultured at a temperature below 50 °C.

[0026] The product is recovered by conventional methods according to the host cell. Thus, where the host cell is
bacterial, such as E. coli, or yeast, such as Pichia, it may be lysed physically, chemically or enzymatically and the protein
product isolated from the resulting lysate. Where the host cell is mammalian, the product may generally be isolated from
the nutrient medium or from cell free extracts. Conventional protein isolation techniques include selective precipitation,
adsorption chromatography, and affinity chromatography including a monoclonal antibody affinity column.

[0027] For proteins of the present invention provided with Histidine tails, purification can easily be achieved by the
use of a metal ion affinity column. In a preferred embodiment, the protein is further purified by subjecting it to cation
exchange chromatography and/or gel filtration chromatography. The protein is then sterilised by passing through a 0.22
µm membrane.

[0028] The proteins of the invention can then be formulated as a vaccine, or the Histidine residues enzymatically
cleaved.

[0029] The proteins of the present invention are provided preferably at least 80% pure, more preferably 90% pure, as
visualised by SDS PAGE. Preferably the proteins appear as a single band by SDS PAGE.

[0030] The present invention also provides a pharmaceutical composition comprising a protein of the present invention
in a pharmaceutically acceptable excipient.

[0031] Vaccine preparation is generally described in New Trends and Developments in Vaccines, Voller et al. 
(eds.), University Park Press, Baltimore, Maryland, 1978. Encapsulation within liposomes is described by Fullerton, US 
Patent 4,235,877. 

[0032] The proteins of the present invention are preferably adjuvanted in the vaccine formulation of the invention.
Suitable adjuvants include an aluminium salt such as aluminium hydroxide gel (alum) or aluminium phosphate, but may
also be a salt of calcium, iron or zinc, or may be an insoluble suspension of acylated tyrosine, or acylated sugars,
cationically or anionically derivatised polysaccharides, or polyphosphazenes.

[0033] In the formulation of the invention it is preferred that the adjuvant composition induces a preferential TH1
response. Suitable adjuvant systems include, for example, a combination of monophosphoryl lipid A or derivative thereof,
preferably 3-de-O-acylated monophosphoryl lipid A (3D-MPL), together with an aluminium salt.

[0034] An enhanced system involves the combination of a monophosphoryl lipid A and a saponin derivative, particularly
the combination of QS21 and 3D-MPL as disclosed in WO 94/00153, or a less reactogenic composition where the QS21
is quenched with cholesterol as disclosed in WO 96/33739.

[0035] A particularly potent adjuvant formulation involving QS21, 3D-MPL and tocopherol in an oil in water emulsion is
described in WO 95/17210 and is a preferred formulation.

[0036] Accordingly in one embodiment of the present invention there is provided a vaccine comprising a protein 
according to the invention adjuvanted with a monophosphoryl lipid A or derivative thereof, especially 3D-MPL. 
[0037] Preferably the vaccine additionally comprises a saponin, more preferably QS21.

[0038] Preferably the formulation additionally comprises an oil in water emulsion and tocopherol. The present invention
also provides a method for producing a vaccine formulation comprising mixing a protein of the present invention together
with a pharmaceutically acceptable excipient, such as 3D-MPL.

[0039] The vaccine of the present invention may additionally comprise further HIV proteins, such as the envelope
glycoprotein gp160 or its derivative gp120.

[0040] In another aspect, the invention relates to an HIV Nef or an HIV Tat protein or derivative thereof expressed in
Pichia pastoris.

[0041] The invention will be further described by reference to the following examples:
EXAMPLES:
General

[0042] Nef and Tat proteins, two regulatory proteins encoded by the human immunodeficiency virus (HIV-1), were
produced in E. coli and in the methylotrophic yeast Pichia pastoris.

[0043] The nef gene from the Bru/Lai isolate (Cell 40: 9-17, 1985) was selected for these constructs since this gene
is among those that are most closely related to the consensus Nef.

[0044] The starting material for the Bru/Lai nef gene was a 1170 bp DNA fragment cloned on the mammalian expression
vector pcDNA3 (pcDNA3/nef).

[0045] The tat gene originates from the BH10 molecular clone. This gene was received as an HTLV III cDNA clone
named pCV1 and described in Science, 229, p69-73, 1985.

1. EXPRESSION OF HIV-1 nef AND tat SEQUENCES IN E.COLI. 

[0046] Sequences encoding the Nef protein as well as a fusion of nef and tat sequences were placed in plasmid
vectors: pRIT14586 and pRIT14589 (see Figure 1).

[0047] Nef and the Nef-Tat fusion were produced as fusion proteins using as fusion partner a part of the protein D. 
Protein D is an immunoglobulin D binding protein exposed at the surface of the gram-negative bacterium Haemophilus 
influenzae. 

[0048] pRIT14586 contains, under the control of a λPL promoter, a DNA sequence derived from the bacterium
Haemophilus influenzae which codes for the first 127 amino acids of the protein D (Infect. Immun. 60: 1336-1342, 1992),
immediately followed by a multiple cloning site region plus a DNA sequence coding for one glycine, 6 histidine residues
and a stop codon (Fig. 1A).

[0049] This vector is designed to express a processed lipidated His tailed fusion protein (LipoD fusion protein). The
fusion protein is synthesised as a precursor with an 18 amino acid residue long signal sequence and, after processing,
the cysteine at position 19 in the precursor molecule becomes the amino terminal residue, which is then modified by
covalently bound fatty acids (Fig. 1B).

[0050] pRIT14589 is almost identical to pRIT14586 except that the protD derived sequence starts immediately after
the cysteine 19 codon. Expression from this vector results in a His tailed, non lipidated fusion protein (Prot D fusion protein).

[0051] Four constructs were made: LipoD-nef-His, LipoD-nef-tat-His, ProtD-nef-His, and ProtD-nef-tat-His.

[0052] The first two constructs were made using the expression vector pRIT14586, the last two constructs used
pRIT14589.

1.1 CONSTRUCTION OF THE RECOMBINANT STRAIN ECLD-N1 PRODUCING THE LIPOD-Nef-HIS FUSION PROTEIN.

1.1.1 Construction of the lipoD-nef-His expression plasmid pRIT14595

[0053] The nef gene (Bru/Lai isolate) was amplified by PCR from the pcDNA3/Nef plasmid with primers 01 and 02.

PRIMER 01 (Seq ID NO 1): 5' ATCGTCCATG.GGT.GGC.AAG.TGG.T 3' (NcoI site)

PRIMER 02 (Seq ID NO 2): 5' CGGCT ACTAGT GCAGTTCTTGAA 3' (SpeI site)

[0054] The nef DNA region amplified starts at nucleotide 8357 and terminates at nucleotide 8971 (Cell, 40: 9-17, 1985).
[0055] An NcoI restriction site (which carries the ATG codon of the nef gene) was introduced at the 5' end of the PCR
fragment while a SpeI site was introduced at the 3' end.
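
The primer design described above can be checked with a small Python sketch. It locates the NcoI (CCATGG) and SpeI (ACTAGT) recognition sequences in primers 01 and 02 and counts the nucleotides in the amplified region 8357 to 8971. The primer strings are those printed above with spaces and dots removed; the function and variable names are assumptions introduced for illustration.

```python
NCOI = "CCATGG"   # NcoI recognition sequence
SPEI = "ACTAGT"   # SpeI recognition sequence

# Primer sequences as printed in the text (spacing and dots removed).
PRIMER_01 = "ATCGTCCATGGGTGGCAAGTGGT"
PRIMER_02 = "CGGCTACTAGTGCAGTTCTTGAA"


def site_position(primer: str, site: str) -> int:
    """Return the 0-based index of the first occurrence of site, or -1 if absent."""
    return primer.upper().find(site)


def region_length(first_nt: int, last_nt: int) -> int:
    """Number of nucleotides in an inclusive coordinate range."""
    return last_nt - first_nt + 1


nef_len = region_length(8357, 8971)
print("NcoI in primer 01 at index", site_position(PRIMER_01, NCOI))
print("SpeI in primer 02 at index", site_position(PRIMER_02, SPEI))
print("ATG inside the NcoI site:", "ATG" in NCOI)
print("amplified region:", nef_len, "nt =", nef_len // 3, "codons")
```

The amplified region is a whole number of codons, which is what an in-frame fusion between the NcoI and SpeI sites requires.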
[0056] The PCR fragment obtained and the expression plasmid pRIT14586 were both restricted by NcoI and SpeI,
purified on an agarose gel, ligated and transformed in the appropriate E. coli host cell, strain AR58. This strain is a cryptic
λ lysogen derived from N99 that is galE::Tn10, Δ-8 (chlD-pgl), Δ-H1 (cro-chlA), N+, and cI857.

[0057] The resulting recombinant plasmid received, after verification of the nef amplified region by automatic
sequencing (see section 1.1.2 below), the pRIT14595 denomination.

1.1.2 Selection of transformants of E. coli strain AR58 with pRIT14595.

[0058] When transformed in the AR58 E. coli host strain, the recombinant plasmid directs the heat-inducible production
of the heterologous protein.

[0059] Heat inducible protein production of several recombinant lipoD-Nef-His transformants was analysed by
Coomassie Blue stained SDS-PAGE. All the transformants analysed showed a heat inducible heterologous protein production.
The abundance of the recombinant Lipo D-Nef-Tat-His fusion protein was estimated at 10% of total protein. 
[0060] One of the transformants was selected and given the laboratory accession number ECLD-N1.

[0061] The recombinant plasmid was reisolated from strain ECLD-N1, and the sequence of the nef-His coding region
was confirmed by automated sequencing. This plasmid received the official designation pRIT14595.
[0062] The fully processed and acylated recombinant Lipo D-nef-His fusion protein produced by strain ECLD-N1 is
composed of:

° Fatty acids
° 109 a.a. of protein D (starting at a.a. 19 and extending to a.a. 127).
° A methionine, created by the use of the NcoI cloning site of pRIT14586 (Fig. 1).
° 205 a.a. of Nef protein (starting at a.a. 2 and extending to a.a. 206).
° A threonine and a serine created by the cloning procedure (cloning at the SpeI site of pRIT14586).
° One glycine and six histidines.

1.2 CONSTRUCTION OF RECOMBINANT STRAIN ECD-N1 PRODUCING PROT D-Nef-HIS FUSION PROTEIN. 

[0063] Construction of expression plasmid pRIT14600 encoding the Prot D-Nef-His fusion protein was identical to the
plasmid construction described in example 1.1.1 with the exception that pRIT14589 was used as receptor plasmid for
the PCR amplified nef fragment.

[0064] E. coli AR58 strain was transformed with pRIT14600 and transformants were analysed as described in example
1.1.2. The transformant selected received laboratory accession number ECD-N1.

1.3 CONSTRUCTION OF RECOMBINANT STRAIN ECLD-NT6 PRODUCING THE LIPO D-Nef-Tat-HIS FUSION 
PROTEIN. 

1.3.1 Construction of the lipo D-Nef-Tat-His expression plasmid pRIT14596 

[0065] The tat gene (BH10 isolate) was amplified by PCR from a derivative of the pCV1 plasmid with primers 03 and
04. SpeI restriction sites were introduced at both ends of the PCR fragment.

PRIMER 03 (Seq ID NO 3): 5' ATCGT ACTAGT.GAG.CCA.GTA.GAT.C 3' (SpeI site)

PRIMER 04 (Seq ID NO 4): 5' CGGCT ACTAGT TTCCTTCGGGCCT 3' (SpeI site)

[0066] The nucleotide sequence of the amplified tat gene is illustrated in the pCV1 clone (Science 229: 69-73, 1985)
and covers nucleotide 5414 to nucleotide 7998.
[0067] The PCR fragment obtained and the plasmid pRIT14595 (expressing lipoD-Nef-His protein) were both digested
by SpeI restriction enzyme, purified on an agarose gel, ligated and transformed in competent AR58 cells. The resulting
recombinant plasmid received, after verification of the tat amplified sequence by automatic sequencing (see section
1.3.2 below), the pRIT14596 denomination.

1.3.2 Selection of transformants of strain AR58 with pRIT14596

[0068] Transformants were grown, heat induced and their proteins were analysed by Coomassie Blue stained gels.
The production level of the recombinant protein was estimated at 1% of total protein. One recombinant strain was
selected and received the laboratory denomination ECLD-NT6.

[0069] The lipoD-nef-tat-His recombinant plasmid was reisolated from ECLD-NT6 strain, sequenced and received
the official designation pRIT14596.

[0070] The fully processed and acylated recombinant Lipo D-Nef-Tat-His fusion protein produced by strain ECLD-NT6
is composed of:

° Fatty acids
° 109 a.a. of protein D (starting at a.a. 19 and extending to a.a. 127).
° A methionine, created by the use of the NcoI cloning site of pRIT14586.
° 205 a.a. of the Nef protein (starting at a.a. 2 and extending to a.a. 206)
° A threonine and a serine created by the cloning procedure
° 85 a.a. of the Tat protein (starting at a.a. 2 and extending to a.a. 86)
° A threonine and a serine introduced by the cloning procedure
° One glycine and six histidines.
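
The residue counts in the list above follow from inclusive position ranges, which the following Python sketch makes explicit. The helper name and the component labels are assumptions for illustration, and the printed total is derived here from the listed components; it is not a figure stated in the text.

```python
def residues_in_range(first: int, last: int) -> int:
    """Count residues in an inclusive range of amino acid positions."""
    return last - first + 1


# (component, number of residues); fatty acids are not amino acids and are left out.
LIPO_D_NEF_TAT_HIS = [
    ("protein D, a.a. 19-127", residues_in_range(19, 127)),
    ("methionine from NcoI site", 1),
    ("Nef, a.a. 2-206", residues_in_range(2, 206)),
    ("threonine + serine", 2),
    ("Tat, a.a. 2-86", residues_in_range(2, 86)),
    ("threonine + serine", 2),
    ("glycine + six histidines", 7),
]

for name, count in LIPO_D_NEF_TAT_HIS:
    print(f"{name:28s} {count:4d}")

total = sum(count for _, count in LIPO_D_NEF_TAT_HIS)
print("total residues (derived, not stated in the text):", total)
```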

1.4 CONSTRUCTION OF RECOMBINANT STRAIN ECD-NT1 PRODUCING PROT D-Nef-Tat-HIS FUSION PROTEIN.

[0071] Construction of expression plasmid pRIT14601 encoding the Prot D-Nef-Tat-His fusion protein was identical
to the plasmid construction described in example 1.3.1 with the exception that pRIT14600 was used as receptor plasmid
for the PCR amplified nef fragment.

[0072] E. coli AR58 strain was transformed with pRIT14601 and transformants were analysed as described previously.
The transformant selected received laboratory accession number ECD-NT1.

2. EXPRESSION OF HIV-1 nef AND tat SEQUENCES IN PICHIA PASTORIS.

[0073] Nef protein, Tat protein and the fusion Nef-Tat were expressed in the methylotrophic yeast Pichia pastoris 
under the control of the inducible alcohol oxidase (AOX1) promoter. 

[0074] To express these HIV-1 genes a modified version of the integrative vector PHIL-D2 (INVITROGEN) was used.
This vector was modified in such a way that expression of heterologous protein starts immediately after the native ATG
codon of the AOX1 gene and will produce recombinant protein with a tail of one glycine and six histidine residues. This
PHIL-D2-MOD vector was constructed by cloning an oligonucleotide linker between the adjacent AsuII and EcoRI sites
of PHIL-D2 vector (see Figure 3). In addition to the His tail, this linker carries NcoI, SpeI and XbaI restriction sites between
which nef, tat and nef-tat fusion were inserted.

2.1 CONSTRUCTION OF THE INTEGRATIVE VECTORS pRIT14597 (encoding Nef-His protein), pRIT14598
(encoding Tat-His protein) and pRIT14599 (encoding fusion Nef-Tat-His).

[0075] The nef gene was amplified by PCR from the pcDNA3/Nef plasmid with primers 01 and 02 (see section 1.1.1
construction of pRIT14595). The PCR fragment obtained and the integrative PHIL-D2-MOD vector were both restricted
by NcoI and SpeI, purified on agarose gel and ligated to create the integrative plasmid pRIT14597 (see Figure 3).
[0076] The tat gene was amplified by PCR from a derivative of the pCV1 plasmid with primers 05 and 04 (see section
1.3.1 construction of pRIT14596):

PRIMER 05 (Seq ID NO 5): 5' ATCGTCCATGGAGCCAGTAGATC 3' (NcoI site)

[0077] An NcoI restriction site was introduced at the 5' end of the PCR fragment while a SpeI site was introduced at
the 3' end with primer 04. The PCR fragment obtained and the PHIL-D2-MOD vector were both restricted by NcoI and
SpeI, purified on agarose gel and ligated to create the integrative plasmid pRIT14598.

[0078] To construct pRIT14599, a 910 bp DNA fragment corresponding to the nef-tat-His coding sequence was ligated
between the EcoRI blunted (T4 polymerase) and NcoI sites of the PHIL-D2-MOD vector. The nef-tat-His coding fragment
was obtained by XbaI blunted (T4 polymerase) and NcoI digestions of pRIT14596.
2.2 TRANSFORMATION OF PICHIA PASTORIS STRAIN GS115 (his4).

[0079] To obtain Pichia pastoris strains expressing Nef-His, Tat-His and the fusion Nef-Tat-His, strain GS115 was
transformed with linear NotI fragments carrying the respective expression cassettes plus the HIS4 gene to complement
his4 in the host genome. Transformation of GS115 with NotI-linear fragments favors recombination at the AOX1 locus.
[0080] Multicopy integrant clones were selected by quantitative dot blot analysis and the type of integration, insertion
(Mut+ phenotype) or transplacement (Muts phenotype), was determined.

[0081] From each transformation, one transformant showing a high production level for the recombinant protein was
selected:

[0082] Strain Y1738 (Mut+ phenotype) producing the recombinant Nef-His protein, a myristylated 215 amino acid
protein which is composed of:

° Myristic acid
° A methionine, created by the use of the NcoI cloning site of PHIL-D2-MOD vector
° 205 a.a. of Nef protein (starting at a.a. 2 and extending to a.a. 206)
° A threonine and a serine created by the cloning procedure (cloning at the SpeI site of PHIL-D2-MOD vector).
° One glycine and six histidines.

[0083] Strain Y1739 (Mut+ phenotype) producing the Tat-His protein, a 95 amino acid protein which is composed of:

° A methionine created by the use of the NcoI cloning site
° 85 a.a. of the Tat protein (starting at a.a. 2 and extending to a.a. 86)
° A threonine and a serine introduced by the cloning procedure
° One glycine and six histidines

[0084] Strain Y1737 (Muts phenotype) producing the recombinant Nef-Tat-His fusion protein, a myristylated 302 amino
acid protein which is composed of:

° Myristic acid
° A methionine, created by the use of the NcoI cloning site
° 205 a.a. of Nef protein (starting at a.a. 2 and extending to a.a. 206)
° A threonine and a serine created by the cloning procedure
° 85 a.a. of the Tat protein (starting at a.a. 2 and extending to a.a. 86)
° A threonine and a serine introduced by the cloning procedure
° One glycine and six histidines
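
As a consistency check on the three Pichia constructs, the Python sketch below sums the listed components and compares each sum with the protein length stated in the text (215, 95 and 302 amino acids). The dictionary layout and function name are assumptions for illustration; the residue counts are those given in the lists above, with myristic acid excluded because it is not an amino acid.

```python
# Residue counts taken from the component lists above.
MET, NEF, TAT, THR_SER, GLY_HIS6 = 1, 205, 85, 2, 7

CONSTRUCTS = {
    # name: (component counts, total length stated in the text)
    "Nef-His": ([MET, NEF, THR_SER, GLY_HIS6], 215),
    "Tat-His": ([MET, TAT, THR_SER, GLY_HIS6], 95),
    "Nef-Tat-His": ([MET, NEF, THR_SER, TAT, THR_SER, GLY_HIS6], 302),
}


def check_lengths(constructs):
    """Map each construct name to True when its components sum to the stated length."""
    return {name: sum(parts) == stated for name, (parts, stated) in constructs.items()}


for name, consistent in check_lengths(CONSTRUCTS).items():
    print(name, "consistent" if consistent else "mismatch")
```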

3. EXPRESSION OF HIV-1 Tat-MUTANT IN PICHIA PASTORIS 

[0085] As well as a Nef-Tat mutant fusion protein, a mutant recombinant Tat protein has also been expressed. The
mutant Tat protein must be biologically inactive while maintaining its immunogenic epitopes.

[0086] A double mutant tat gene, constructed by D. Clements (Tulane University), was selected for these constructs.
[0087] This tat gene (originating from the BH10 molecular clone) bears mutations in the active site region (Lys41→Ala)
and in the RGD motif (Arg78→Lys and Asp80→Glu) (Virology 235: 48-64, 1997).
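
The three substitutions named above can be expressed as a short Python sketch that applies point mutations to a one-letter sequence and verifies the wild-type residue at each position first. The sequence used is a toy placeholder of 'X' characters with only the relevant positions filled in; it is explicitly not the real Tat sequence, and the function name is an assumption for illustration.

```python
def apply_substitutions(sequence, substitutions):
    """Apply point substitutions given as (wild_type, position, mutant), positions 1-based."""
    residues = list(sequence)
    for wild_type, position, mutant in substitutions:
        found = residues[position - 1]
        if found != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}, found {found}")
        residues[position - 1] = mutant
    return "".join(residues)


# Lys41->Ala, Arg78->Lys, Asp80->Glu, as listed in the text.
TAT_MUTATIONS = [("K", 41, "A"), ("R", 78, "K"), ("D", 80, "E")]

# Toy 86-residue placeholder, NOT the real Tat sequence.
toy = ["X"] * 86
toy[40] = "K"          # position 41
toy[77:80] = "RGD"     # positions 78-80
toy_wild_type = "".join(toy)

toy_mutant = apply_substitutions(toy_wild_type, TAT_MUTATIONS)
print(toy_wild_type[77:80], "->", toy_mutant[77:80])
```

Under these assumptions the RGD motif at positions 78 to 80 becomes KGE, which matches the naming of the pCMVLys41/KGE plasmid.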

[0088] The mutant tat gene was received as a cDNA fragment subcloned between the EcoRI and HindIII sites within
a CMV expression plasmid (pCMVLys41/KGE).

3.1 CONSTRUCTION OF THE INTEGRATIVE VECTORS pRIT14912 (encoding Tat mutant-His protein) and
pRIT14913 (encoding fusion Nef-Tat mutant-His).

[0089] The tat mutant gene was amplified by PCR from the pCMVLys41/KGE plasmid with primers 05 and 04 (see
section 2.1 construction of pRIT14598).

[0090] An NcoI restriction site was introduced at the 5' end of the PCR fragment while a SpeI site was introduced at
the 3' end with primer 04. The PCR fragment obtained and the PHIL-D2-MOD vector were both restricted by NcoI and
SpeI, purified on agarose gel and ligated to create the integrative plasmid pRIT14912.
[0091] To construct pRIT14913, the tat mutant gene was amplified by PCR from the pCMVLys41/KGE plasmid with
primers 03 and 04 (see section 1.3.1 construction of pRIT14596).

[0092] The PCR fragment obtained and the plasmid pRIT14597 (expressing Nef-His protein) were both digested by
SpeI restriction enzyme, purified on agarose gel and ligated to create the integrative plasmid pRIT14913.
3.2 TRANSFORMATION OF PICHIA PASTORIS STRAIN GS115.



[0093] Pichia pastoris strains expressing Tat mutant-His protein and the fusion Nef-Tat mutant-His were obtained by
applying the integration and recombinant strain selection strategies previously described in section 2.2.
[0094] Two recombinant strains producing Tat mutant-His protein, a 95 amino acid protein, were selected: Y1775
(Mut+ phenotype) and Y1776 (Muts phenotype).

[0095] One recombinant strain expressing Nef-Tat mutant-His fusion protein, a 302 amino acid protein, was selected:
Y1774 (Mut+ phenotype).



4. PURIFICATION OF Nef-Tat-His FUSION PROTEIN (PICHIA PASTORIS) 



[0096] The purification scheme has been developed from 146g of recombinant Pichia pastoris cells (wet weight) or
2L Dyno-mill homogenate OD 55. The chromatographic steps are performed at room temperature. Between steps,
Nef-Tat positive fractions are kept overnight in the cold room (+4°C); for longer times, samples are frozen at -20°C.



146g of Pichia pastoris cells
  ↓
Homogenization
      Buffer: 2L 50 mM PO4 pH 7.0, final OD: 50
  ↓
Dyno-mill disruption (4 passes)
  ↓
Centrifugation
      JA10 rotor / 9500 rpm / 30 min / room temperature
  ↓
Dyno-mill Pellet
  ↓
Wash (1h - 4°C)
      Buffer: + 2L 10 mM PO4 pH 7.5 - 150mM NaCl - 0.5% Empigen
  ↓
Centrifugation
      JA10 rotor / 9500 rpm / 30 min / room temperature
  ↓
Pellet
  ↓
Solubilisation (O/N - 4°C)
      Buffer: + 660ml 10 mM PO4 pH 7.5 - 150mM NaCl - 4.0M GuHCl
  ↓
Reduction (4h - room temperature - in the dark)
      + 0.2M 2-mercaptoethanesulfonic acid, sodium salt (powder addition) / pH adjusted to 7.5 (with 0.5M NaOH
      solution) before incubation
  ↓
Carboxymethylation (1/2 h - room temperature - in the dark)
      + 0.25M Iodoacetamide (powder addition) / pH adjusted to 7.5 (with 0.5M NaOH solution) before incubation
  ↓
Immobilized metal ion affinity chromatography on Ni++-NTA-Agarose (Qiagen - 30 ml of resin)
      Equilibration buffer: 10 mM PO4 pH 7.5 - 150mM NaCl - 4.0M GuHCl
      Washing buffer: 1) Equilibration buffer
                      2) 10 mM PO4 pH 7.5 - 150mM NaCl - 6M Urea
                      3) 10 mM PO4 pH 7.5 - 150mM NaCl - 6M Urea - 25 mM Imidazole
      Elution buffer: 10 mM PO4 pH 7.5 - 150mM NaCl - 6M Urea - 0.5M Imidazole
  ↓
Dilution
      Down to an ionic strength of 18 mS/cm2
      Dilution buffer: 10 mM PO4 pH 7.5 - 6M Urea
  ↓
Cation exchange chromatography on SP Sepharose FF (Pharmacia - 30 ml of resin)
      Equilibration buffer: 10 mM PO4 pH 7.5 - 150mM NaCl - 6.0M Urea
      Washing buffer: 1) Equilibration buffer
                      2) 10 mM PO4 pH 7.5 - 250mM NaCl - 6M Urea
      Elution buffer: 10 mM Borate pH 9.0 - 2M NaCl - 6M Urea
  ↓
Concentration
      up to 5 mg/ml
      10kDa Omega membrane (Filtron)
  ↓
Gel filtration chromatography on Superdex200 XK 16/60 (Pharmacia - 120 ml of resin)
      Elution buffer: 10 mM PO4 pH 7.5 - 150mM NaCl - 6M Urea
      5 ml of sample / injection, 5 injections
  ↓
Dialysis (O/N - 4°C)
      Buffer: 10 mM PO4 pH 6.8 - 150mM NaCl - 0.5M Arginine*
  ↓
Sterile filtration
      Millex GV 0.22 µm

* ratio: 0.5M Arginine for a protein concentration of 1600 µg/ml.



Purity 

[0097] The level of purity as estimated by SDS-PAGE is shown in Figure 4 by Daiichi Silver Staining and in Figure 5 
by Coomassie blue G250. 

After Superdex200 step: > 95% 

After dialysis and sterile filtration steps: > 95% 

Recovery 

[0098] 51 mg of Nef-Tat-His protein are purified from 146g of recombinant Pichia pastoris cells (= 2L of Dyno-mill homogenate OD 55).
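
The recovery figures above can be expressed per unit of starting material. The following is a minimal sketch of that arithmetic only; the function and variable names are illustrative assumptions and are not part of the described process.

```python
def recovery_per_unit(protein_mg: float, amount: float) -> float:
    """Return purified protein (mg) per unit of starting material."""
    if amount <= 0:
        raise ValueError("amount of starting material must be positive")
    return protein_mg / amount


# Figures taken from paragraph [0098]; the names are hypothetical.
PURIFIED_PROTEIN_MG = 51.0   # Nef-Tat-His recovered after the full scheme
WET_CELL_WEIGHT_G = 146.0    # recombinant Pichia pastoris cells (wet weight)
HOMOGENATE_VOLUME_L = 2.0    # Dyno-mill homogenate at OD 55

mg_per_g_cells = recovery_per_unit(PURIFIED_PROTEIN_MG, WET_CELL_WEIGHT_G)
mg_per_l_homogenate = recovery_per_unit(PURIFIED_PROTEIN_MG, HOMOGENATE_VOLUME_L)

print(f"{mg_per_g_cells:.2f} mg per g of wet cells")
print(f"{mg_per_l_homogenate:.1f} mg per L of homogenate")
```

Expressing the recovery this way makes it easier to compare batches of different size; it says nothing about step-by-step losses, which the text does not report.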

5. VACCINE PREPARATION 



[0099] A vaccine prepared in accordance with the invention comprises the expression product of a recombinant DNA encoding an antigen as exemplified in example 1 or 2 and, as adjuvant, a formulation comprising a mixture of 3-de-O-acylated monophosphoryl lipid A (3D-MPL) and QS21 in an oil/water emulsion.

[0100] 3D-MPL is a chemically detoxified form of the lipopolysaccharide (LPS) of the Gram-negative bacterium Salmonella minnesota.

[0101] Experiments performed at SmithKline Beecham Biologicals have shown that 3D-MPL combined with various vehicles strongly enhances both humoral and TH1-type cellular immunity.

[0102] QS21 is a saponin purified from a crude extract of the bark of the Quillaja saponaria Molina tree, which has a strong adjuvant activity: it activates both antigen-specific lymphoproliferation and CTLs to several antigens.
Experiments performed at SmithKline Beecham Biologicals have demonstrated a clear synergistic effect of combinations of 3D-MPL and QS21 in the induction of both humoral and TH1-type cellular immune responses.
[0103] The oil/water emulsion is composed of 2 oils (a tocopherol and squalene), and of PBS containing Tween 80 as emulsifier. The emulsion comprised 5% squalene, 5% tocopherol and 0.4% Tween 80 and had an average particle size of 180 nm (see WO 95/17210).

[0104] Experiments performed at SmithKline Beecham Biologicals have proven that the addition of this O/W emulsion to 3D-MPL/QS21 further increases their immunostimulant properties.

Preparation of the oil/water emulsion (2 fold concentrate)

[0105] Tween 80 is dissolved in phosphate buffered saline (PBS) to give a 2% solution in the PBS. To provide 100ml of two fold concentrate emulsion, 5g of DL alpha tocopherol and 5ml of squalene are vortexed to mix thoroughly. 90ml of PBS/Tween solution is added and mixed thoroughly. The resulting emulsion is then passed through a syringe and finally microfluidised by using an M110S microfluidics machine. The resulting oil droplets have a size of approximately 180 nm.

Preparation of oil in water formulation. 

[0106] Antigen prepared in accordance with example 1 or 2 (5 µg) was diluted in 10 fold concentrated PBS pH 6.8 and H2O before consecutive addition of SB62, 3D-MPL (5 µg), QS21 (5 µg) and 50 µg/ml thiomersal as preservative at 5 min intervals. The emulsion volume is equal to 50% of the total volume (50 µl for a dose of 100 µl).
[0107] All incubations were carried out at room temperature with agitation.

6. IMMUNOGENICITY OF Tat AND Nef-Tat IN RODENTS

[0108] The immune response induced after immunization with Tat and NefTat was characterized. To obtain information on isotype profiles and cell-mediated immunity (CMI), two immunization experiments in mice were conducted. In the first experiment mice were immunized twice, two weeks apart, into the footpad with Tat or NefTat in the oxidized or reduced form, respectively. Antigens were formulated in an oil in water emulsion comprising squalene, Tween 80™ (polyoxyethylene sorbitan monooleate), QS21, 3D-MPL and α-tocopherol, and a control group received the adjuvant alone. Two weeks after the last immunization sera were obtained and subjected to Tat-specific ELISA (using reduced Tat for coating) for the determination of antibody titers and isotypes (Figure 6a). The antibody titers were highest in the mice that had received oxidized Tat. In general, the oxidized molecules induced higher antibody titers than the reduced forms, and Tat alone induced higher antibody titers than NefTat. The latter observation was confirmed in the second experiment. Most interestingly, the isotype profile of Tat-specific antibodies differed depending on the antigens used for immunization. Tat alone elicited a balanced IgG1 and IgG2a profile, while NefTat induced a much stronger TH2 bias (Figure 6b). This was again confirmed in the second experiment.

[0109] In the second mouse experiment animals received only the reduced forms of the molecules or the adjuvant alone. Besides serological analysis (see above), lymphoproliferative responses from lymph node cells were evaluated. After restimulation of those cells in vitro with Tat or NefTat, 3H-thymidine incorporation was measured after 4 days of culture. Presentation of the results as stimulation indices indicates that very strong responses were induced in both groups of mice that had received antigen (Figure 7).

[0110] In conclusion, the mouse studies indicate that Tat as well as Nef-Tat are highly immunogenic candidate vaccine antigens. The immune response directed against the two molecules is characterized by high antibody responses with at least 50% IgG1. Furthermore, strong CMI responses (as measured by lymphoproliferation) were observed.

7. FUNCTIONAL PROPERTIES OF THE Tat AND Nef-Tat PROTEINS

[0111] The Tat and NefTat molecules in oxidized or reduced form were investigated for their ability to bind to human T cell lines. Furthermore, the effect on growth of those cell lines was assessed. ELISA plates were coated overnight with different concentrations of the Tat and NefTat proteins, the irrelevant gD from herpes simplex virus type II, or with a buffer control alone. After removal of the coating solution HUT-78 cells were added to the wells. After two hours of incubation the wells were washed and binding of cells to the bottom of the wells was assessed microscopically. As a quantitative measure cells were stained with toluidine blue, lysed by SDS, and the toluidine blue concentration in the supernatant was determined with an ELISA plate reader. The results indicate that all four proteins, Tat and NefTat in oxidized or reduced form, mediated binding of the cells to the ELISA plate (Figure 8). The irrelevant protein (data not shown) and the buffer did not fix the cells. This indicates that the recombinantly expressed Tat-containing proteins bind specifically to human T cell lines.
[0112] In a second experiment HUT-78 cells were left in contact with the proteins for 16 hours. At the end of the incubation period the cells were labeled with [3H]-thymidine and the incorporation rate was determined as a measure of cell growth. All four proteins included in this assay inhibited cell growth as judged by diminished radioactivity incorporation (Figure 9). The buffer control did not mediate this effect. These results demonstrate that the recombinant Tat-containing proteins are capable of inhibiting growth of a human T cell line.

[0113] In summary, the functional characterization of the Tat and NefTat proteins reveals that these proteins are able to bind to human T cell lines. Furthermore, the proteins are able to inhibit growth of such cell lines.

SEQUENCE LISTING 

[0114]

(1) GENERAL INFORMATION

(i) APPLICANT: SmithKline Beecham Biologicals S.A.

(ii) TITLE OF THE INVENTION: Vaccine

(iii) NUMBER OF SEQUENCES: 27

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: SmithKline Beecham
(B) STREET: Two New Horizons Court
(C) CITY: Brentford
(D) STATE:
(E) COUNTRY: Middx, UK
(F) ZIP: TW8 9EP

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: DOS
(D) SOFTWARE: FastSEQ for Windows Version 2.0

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:
(B) FILING DATE: 26-SEP-1997
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER:
(B) FILING DATE:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Bor, Fiona R
(B) REGISTRATION NUMBER:
(C) REFERENCE/DOCKET NUMBER:

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 0181 975 2817
(B) TELEFAX: 0181 975 6141
(C) TELEX:

(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
ATCGTCCATG .GGT.GGC.A AG.TGG.T 28

(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
CGGCTACTAG TGCAGTTCTT GAA 23

(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
ATCGTACTAG T.GAG.CCA. GTA.GAT.C 29

(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
CGGCTACTAG TTTCCTTCGG GCCT 24

(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
ATCGTCCATG GAGCCAGTAG ATC 23

(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 441 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

ATGGATCCAA AAACTTTAGC CCTTTCTTTA TTAGCAGCTG GCGTACTAGC AGGTTGTAGC 60
AGCCATTCAT CAAATATGGC GAATACCCAA ATGAAATCAG ACAAAATCAT TATTGCTCAC 120
CGTGGTGCTA GCGGTTATTT ACCAGAGCAT ACGTTAGAAT CTAAAGCACT TGCTTTTGCA 180
CAACAGGCTG ATTATTTAGA GCAAGATTTA GCAATGACTA AGGATGGTCG TTTAGTGGTT 240
ATTCACGATC ACTTTTTAGA TGGCTTGACT GATGTTGCGA AAAAATTCCC ACATCGTCAT 300
CGTAAAGATG GCCGTTACTA TGTCATCGAC TTTACCTTAA AAGAAATTCA AAGTTTAGAA 360
ATGACAGAAA ACTTTGAAAC CATGGCCACG TGTGATCAGA GCTCAACTAG TGGCCACCAT 420
CACCATCACC ATTAATCTAG A 441


(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 144 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

Met Asp Pro Lys Thr Leu Ala Leu Ser Leu Leu Ala Ala Gly Val Leu Ala Gly Cys Ser 20
Ser His Ser Ser Asn Met Ala Asn Thr Gln Met Lys Ser Asp Lys Ile Ile Ile Ala His 40
Arg Gly Ala Ser Gly Tyr Leu Pro Glu His Thr Leu Glu Ser Lys Ala Leu Ala Phe Ala 60
Gln Gln Ala Asp Tyr Leu Glu Gln Asp Leu Ala Met Thr Lys Asp Gly Arg Leu Val Val 80
Ile His Asp His Phe Leu Asp Gly Leu Thr Asp Val Ala Lys Lys Phe Pro His Arg His 100
Arg Lys Asp Gly Arg Tyr Tyr Val Ile Asp Phe Thr Leu Lys Glu Ile Gln Ser Leu Glu 120
Met Thr Glu Asn Phe Glu Thr Met Ala Thr Cys Asp Gln Ser Ser Thr Ser Gly His His 140
His His His His 144

(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 648 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

ATGGGTGGCA AGTGGTCAAA AAGTAGTGTG GTTGGATGGC CTACTGTAAG GGAAAGAATG 60
AGACGAGCTG AGCCAGCAGC AGATGGGGTG GGAGCAGCAT CTCGAGACCT GGAAAAACAT 120
GGAGCAATCA CAAGTAGCAA TACAGCAGCT ACCAATGCTG CTTGTGCCTG GCTAGAAGCA 180
CAAGAGGAGG AGGAGGTGGG TTTTCCAGTC ACACCTCAGG TACCTTTAAG ACCAATGACT 240
TACAAGGCAG CTGTAGATCT TAGCCACTTT TTAAAAGAAA AGGGGGGACT GGAAGGGCTA 300
ATTCACTCCC AACGAAGACA AGATATCCTT GATCTGTGGA TCTACCACAC ACAAGGCTAC 360
TTCCCTGATT GGCAGAACTA CACACCAGGG CCAGGGGTCA GATATCCACT GACCTTTGGA 420
TGGTGCTACA AGCTAGTACC AGTTGAGCCA GATAAGGTAG AAGAGGCCAA TAAAGGAGAG 480
AACACCAGCT TGTTACACCC TGTGAGCCTG CATGGAATGG ATGACCCTGA GAGAGAAGTG 540
TTAGAGTGGA GGTTTGACAG CCGCCTAGCA TTTCATCACG TGGCCCGAGA GCTGCATCCG 600
GAGTACTTCA AGAACTGCAC TAGTGGCCAC CATCACCATC ACCATTAA 648

(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 216 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

Met Gly Gly Lys Trp Ser Lys Ser Ser Val Val Gly Trp Pro Thr Val Arg Glu Arg Met 20
Arg Arg Ala Glu Pro Ala Ala Asp Gly Val Gly Ala Ala Ser Arg Asp Leu Glu Lys His 40
Gly Ala Ile Thr Ser Ser Asn Thr Ala Ala Thr Asn Ala Ala Cys Ala Trp Leu Glu Ala 60
Gln Glu Glu Glu Glu Val Gly Phe Pro Val Thr Pro Gln Val Pro Leu Arg Pro Met Thr 80
Tyr Lys Ala Ala Val Asp Leu Ser His Phe Leu Lys Glu Lys Gly Gly Leu Glu Gly Leu 100
Ile His Ser Gln Arg Arg Gln Asp Ile Leu Asp Leu Trp Ile Tyr His Thr Gln Gly Tyr 120
Phe Pro Asp Trp Gln Asn Tyr Thr Pro Gly Pro Gly Val Arg Tyr Pro Leu Thr Phe Gly 140
Trp Cys Tyr Lys Leu Val Pro Val Glu Pro Asp Lys Val Glu Glu Ala Asn Lys Gly Glu 160
Asn Thr Ser Leu Leu His Pro Val Ser Leu His Gly Met Asp Asp Pro Glu Arg Glu Val 180
Leu Glu Trp Arg Phe Asp Ser Arg Leu Ala Phe His His Val Ala Arg Glu Leu His Pro 200
Glu Tyr Phe Lys Asn Cys Thr Ser Gly His His His His His His 215

(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 288 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

ATGGAGCCAG TAGATCCTAG ACTAGAGCCC TGGAAGCATC CAGGAAGTCA GCCTAAAACT 60
GCTTGTACCA ATTGCTATTG TAAAAAGTGT TGCTTTCATT GCCAAGTTTG TTTCATAACA 120
AAAGCCTTAG GCATCTCCTA TGGCAGGAAG AAGCGGAGAC AGCGACGAAG ACCTCCTCAA 180
GGCAGTCAGA CTCATCAAGT TTCTCTATCA AAGCAACCCA CCTCCCAATC CCGAGGGGAC 240
CCGACAGGCC CGAAGGAAAC TAGTGGCCAC CATCACCATC ACCATTAA 288

(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 96 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

Met Glu Pro Val Asp Pro Arg Leu Glu Pro Trp Lys His Pro Gly Ser Gln Pro Lys Thr 20
Ala Cys Thr Asn Cys Tyr Cys Lys Lys Cys Cys Phe His Cys Gln Val Cys Phe Ile Thr 40
Lys Ala Leu Gly Ile Ser Tyr Gly Arg Lys Lys Arg Arg Gln Arg Arg Arg Pro Pro Gln 60
Gly Ser Gln Thr His Gln Val Ser Leu Ser Lys Gln Pro Thr Ser Gln Ser Arg Gly Asp 80
Pro Thr Gly Pro Lys Glu Thr Ser Gly His His His His His His 95
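
The relationship between the nucleotide sequence of SEQ ID NO:10 and the amino acid sequence of SEQ ID NO:11 can be checked by translating the open reading frame with the standard genetic code. The sketch below is illustrative only: the function name `translate` and the constant names are assumptions, and the nucleotide string is copied from the SEQ ID NO:10 entry of this listing.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence (one-letter amino acids) up to the first stop codon."""
    seq = "".join(dna.split()).upper()  # drop the spacing used in the listing
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends the open reading frame
            break
        protein.append(aa)
    return "".join(protein)


# Nucleotide sequence as given for SEQ ID NO:10 (Tat with C-terminal His tail).
SEQ_ID_NO_10 = """
ATGGAGCCAG TAGATCCTAG ACTAGAGCCC TGGAAGCATC CAGGAAGTCA GCCTAAAACT
GCTTGTACCA ATTGCTATTG TAAAAAGTGT TGCTTTCATT GCCAAGTTTG TTTCATAACA
AAAGCCTTAG GCATCTCCTA TGGCAGGAAG AAGCGGAGAC AGCGACGAAG ACCTCCTCAA
GGCAGTCAGA CTCATCAAGT TTCTCTATCA AAGCAACCCA CCTCCCAATC CCGAGGGGAC
CCGACAGGCC CGAAGGAAAC TAGTGGCCAC CATCACCATC ACCATTAA
"""

tat_his = translate(SEQ_ID_NO_10)
print(len(tat_his), tat_his[:10], tat_his[-9:])
```

The same function can be applied to the other nucleotide entries of the listing (for example SEQ ID NO:8 or SEQ ID NO:12) to compare each with its corresponding amino acid entry.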





(2) INFORMATION FOR SEQ ID NO:12: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 909 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: 



EP 1 015 596 B1 



ATGGGTGGCA AGTGGTCAAA AAGTAGTGTG GTTGGATGGC CTACTGTAAG GGAAAGAATG 60
AGACGAGCTG AGCCAGCAGC AGATGGGGTG GGAGCAGCAT CTCGAGACCT GGAAAAACAT 120
GGAGCAATCA CAAGTAGCAA TACAGCAGCT ACCAATGCTG CTTGTGCCTG GCTAGAAGCA 180
CAAGAGGAGG AGGAGGTGGG TTTTCCAGTC ACACCTCAGG TACCTTTAAG ACCAATGACT 240
TACAAGGCAG CTGTAGATCT TAGCCACTTT TTAAAAGAAA AGGGGGGACT GGAAGGGCTA 300
ATTCACTCCC AACGAAGACA AGATATCCTT GATCTGTGGA TCTACCACAC ACAAGGCTAC 360
TTCCCTGATT GGCAGAACTA CACACCAGGG CCAGGGGTCA GATATCCACT GACCTTTGGA 420
TGGTGCTACA AGCTAGTACC AGTTGAGCCA GATAAGGTAG AAGAGGCCAA TAAAGGAGAG 480
AACACCAGCT TGTTACACCC TGTGAGCCTG CATGGAATGG ATGACCCTGA GAGAGAAGTG 540
TTAGAGTGGA GGTTTGACAG CCGCCTAGCA TTTCATCACG TGGCCCGAGA GCTGCATCCG 600
GAGTACTTCA AGAACTGCAC TAGTGAGCCA GTAGATCCTA GACTAGAGCC CTGGAAGCAT 660
CCAGGAAGTC AGCCTAAAAC TGCTTGTACC AATTGCTATT GTAAAAAGTG TTGCTTTCAT 720
TGCCAAGTTT GTTTCATAAC AAAAGCCTTA GGCATCTCCT ATGGCAGGAA GAAGCGGAGA 780
CAGCGACGAA GACCTCCTCA AGGCAGTCAG ACTCATCAAG TTTCTCTATC AAAGCAACCC 840
ACCTCCCAAT CCCGAGGGGA CCCGACAGGC CCGAAGGAAA CTAGTGGCCA CCATCACCAT 900
CACCATTAA 909



(2) INFORMATION FOR SEQ ID NO:13: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 303 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

Met Gly Gly Lys Trp Ser Lys Ser Ser Val Val Gly Trp Pro Thr Val Arg Glu Arg Met 20
Arg Arg Ala Glu Pro Ala Ala Asp Gly Val Gly Ala Ala Ser Arg Asp Leu Glu Lys His 40
Gly Ala Ile Thr Ser Ser Asn Thr Ala Ala Thr Asn Ala Ala Cys Ala Trp Leu Glu Ala 60
Gln Glu Glu Glu Glu Val Gly Phe Pro Val Thr Pro Gln Val Pro Leu Arg Pro Met Thr 80
Tyr Lys Ala Ala Val Asp Leu Ser His Phe Leu Lys Glu Lys Gly Gly Leu Glu Gly Leu 100
Ile His Ser Gln Arg Arg Gln Asp Ile Leu Asp Leu Trp Ile Tyr His Thr Gln Gly Tyr 120
Phe Pro Asp Trp Gln Asn Tyr Thr Pro Gly Pro Gly Val Arg Tyr Pro Leu Thr Phe Gly 140
Trp Cys Tyr Lys Leu Val Pro Val Glu Pro Asp Lys Val Glu Glu Ala Asn Lys Gly Glu 160
Asn Thr Ser Leu Leu His Pro Val Ser Leu His Gly Met Asp Asp Pro Glu Arg Glu Val 180
Leu Glu Trp Arg Phe Asp Ser Arg Leu Ala Phe His His Val Ala Arg Glu Leu His Pro 200
Glu Tyr Phe Lys Asn Cys Thr Ser Glu Pro Val Asp Pro Arg Leu Glu Pro Trp Lys His 220
Pro Gly Ser Gln Pro Lys Thr Ala Cys Thr Asn Cys Tyr Cys Lys Lys Cys Cys Phe His 240
Cys Gln Val Cys Phe Ile Thr Lys Ala Leu Gly Ile Ser Tyr Gly Arg Lys Lys Arg Arg 260
Gln Arg Arg Arg Pro Pro Gln Gly Ser Gln Thr His Gln Val Ser Leu Ser Lys Gln Pro 280
Thr Ser Gln Ser Arg Gly Asp Pro Thr Gly Pro Lys Glu Thr Ser Gly His His His His 300
His His 302



(2) INFORMATION FOR SEQ ID NO:14: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1029 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

ATGGATCCAA AAACTTTAGC CCTTTCTTTA TTAGCAGCTG GCGTACTAGC AGGTTGTAGC 60
AGCCATTCAT CAAATATGGC GAATACCCAA ATGAAATCAG ACAAAATCAT TATTGCTCAC 120
CGTGGTGCTA GCGGTTATTT ACCAGAGCAT ACGTTAGAAT CTAAAGCACT TGCTTTTGCA 180
CAACAGGCTG ATTATTTAGA GCAAGATTTA GCAATGACTA AGGATGGTCG TTTAGTGGTT 240
ATTCACGATC ACTTTTTAGA TGGCTTGACT GATGTTGCGA AAAAATTCCC ACATCGTCAT 300
CGTAAAGATG GCCGTTACTA TGTCATCGAC TTTACCTTAA AAGAAATTCA AAGTTTAGAA 360
ATGACAGAAA ACTTTGAAAC CATGGGTGGC AAGTGGTCAA AAAGTAGTGT GGTTGGATGG 420
CCTACTGTAA GGGAAAGAAT GAGACGAGCT GAGCCAGCAG CAGATGGGGT GGGAGCAGCA 480
TCTCGAGACC TGGAAAAACA TGGAGCAATC ACAAGTAGCA ATACAGCAGC TACCAATGCT 540
GCTTGTGCCT GGCTAGAAGC ACAAGAGGAG GAGGAGGTGG GTTTTCCAGT CACACCTCAG 600
GTACCTTTAA GACCAATGAC TTACAAGGCA GCTGTAGATC TTAGCCACTT TTTAAAAGAA 660
AAGGGGGGAC TGGAAGGGCT AATTCACTCC CAACGAAGAC AAGATATCCT TGATCTGTGG 720
ATCTACCACA CACAAGGCTA CTTCCCTGAT TGGCAGAACT ACACACCAGG GCCAGGGGTC 780
AGATATCCAC TGACCTTTGG ATGGTGCTAC AAGCTAGTAC CAGTTGAGCC AGATAAGGTA 840
GAAGAGGCCA ATAAAGGAGA GAACACCAGC TTGTTACACC CTGTGAGCCT GCATGGAATG 900
GATGACCCTG AGAGAGAAGT GTTAGAGTGG AGGTTTGACA GCCGCCTAGC ATTTCATCAC 960
GTGGCCCGAG AGCTGCATCC GGAGTACTTC AAGAACTGCA CTAGTGGCCA CCATCACCAT 1020
CACCATTAA 1029



(2) INFORMATION FOR SEQ ID NO:15: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 325 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

Cys Ser Ser His Ser Ser Asn Met Ala Asn Thr Gln Met Lys Ser Asp Lys Ile Ile Ile 20
Ala His Arg Gly Ala Ser Gly Tyr Leu Pro Glu His Thr Leu Glu Ser Lys Ala Leu Ala 40
Phe Ala Gln Gln Ala Asp Tyr Leu Glu Gln Asp Leu Ala Met Thr Lys Asp Gly Arg Leu 60
Val Val Ile His Asp His Phe Leu Asp Gly Leu Thr Asp Val Ala Lys Lys Phe Pro His 80
Arg His Arg Lys Asp Gly Arg Tyr Tyr Val Ile Asp Phe Thr Leu Lys Glu Ile Gln Ser 100
Leu Glu Met Thr Glu Asn Phe Glu Thr Met Gly Gly Lys Trp Ser Lys Ser Ser Val Val 120
Gly Trp Pro Thr Val Arg Glu Arg Met Arg Arg Ala Glu Pro Ala Ala Asp Gly Val Gly 140
Ala Ala Ser Arg Asp Leu Glu Lys His Gly Ala Ile Thr Ser Ser Asn Thr Ala Ala Thr 160
Asn Ala Ala Cys Ala Trp Leu Glu Ala Gln Glu Glu Glu Glu Val Gly Phe Pro Val Thr 180
Pro Gln Val Pro Leu Arg Pro Met Thr Tyr Lys Ala Ala Val Asp Leu Ser His Phe Leu 200
Lys Glu Lys Gly Gly Leu Glu Gly Leu Ile His Ser Gln Arg Arg Gln Asp Ile Leu Asp 220
Leu Trp Ile Tyr His Thr Gln Gly Tyr Phe Pro Asp Trp Gln Asn Tyr Thr Pro Gly Pro 240
Gly Val Arg Tyr Pro Leu Thr Phe Gly Trp Cys Tyr Lys Leu Val Pro Val Glu Pro Asp 260
Lys Val Glu Glu Ala Asn Lys Gly Glu Asn Thr Ser Leu Leu His Pro Val Ser Leu His 280
Gly Met Asp Asp Pro Glu Arg Glu Val Leu Glu Trp Arg Phe Asp Ser Arg Leu Ala Phe 300
His His Val Ala Arg Glu Leu His Pro Glu Tyr Phe Lys Asn Cys Thr Ser Gly His His 320
His His His His 324




(2) INFORMATION FOR SEQ ID NO:16:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1290 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

ATGGATCCAA AAACTTTAGC CCTTTCTTTA TTAGCAGCTG GCGTACTAGC AGGTTGTAGC 60
AGCCATTCAT CAAATATGGC GAATACCCAA ATGAAATCAG ACAAAATCAT TATTGCTCAC 120
CGTGGTGCTA GCGGTTATTT ACCAGAGCAT ACGTTAGAAT CTAAAGCACT TGCGTTTGCA 180
CAACAGGCTG ATTATTTAGA GCAAGATTTA GCAATGACTA AGGATGGTCG TTTAGTGGTT 240
ATTCACGATC ACTTTTTAGA TGGCTTGACT GATGTTGCGA AAAAATTCCC ACATCGTCAT 300
CGTAAAGATG GCCGTTACTA TGTCATCGAC TTTACCTTAA AAGAAATTCA AAGTTTAGAA 360
ATGACAGAAA ACTTTGAAAC CATGGGTGGC AAGTGGTCAA AAAGTAGTGT GGTTGGATGG 420
CCTACTGTAA GGGAAAGAAT GAGACGAGCT GAGCCAGCAG CAGATGGGGT GGGAGCAGCA 480
TCTCGAGACC TGGAAAAACA TGGAGCAATC ACAAGTAGCA ATACAGCAGC TACCAATGCT 540
GCTTGTGCCT GGCTAGAAGC ACAAGAGGAG GAGGAGGTGG GTTTTCCAGT CACACCTCAG 600
GTACCTTTAA GACCAATGAC TTACAAGGCA GCTGTAGATC TTAGCCACTT TTTAAAAGAA 660
AAGGGGGGAC TGGAAGGGCT AATTCACTCC CAACGAAGAC AAGATATCCT TGATCTGTGG 720
ATCTACCACA CACAAGGCTA CTTCCCTGAT TGGCAGAACT ACACACCAGG GCCAGGGGTC 780
AGATATCCAC TGACCTTTGG ATGGTGCTAC AAGCTAGTAC CAGTTGAGCC AGATAAGGTA 840
GAAGAGGCCA ATAAAGGAGA GAACACCAGC TTGTTACACC CTGTGAGCCT GCATGGAATG 900
GATGACCCTG AGAGAGAAGT GTTAGAGTGG AGGTTTGACA GCCGCCTAGC ATTTCATCAC 960
GTGGCCCGAG AGCTGCATCC GGAGTACTTC AAGAACTGCA CTAGTGAGCC AGTAGATCCT 1020
AGACTAGAGC CCTGGAAGCA TCCAGGAAGT CAGCCTAAAA CTGCTTGTAC CAATTGCTAT 1080
TGTAAAAAGT GTTGCTTTCA TTGCCAAGTT TGTTTCATAA CAAAAGCCTT AGGCATCTCC 1140
TATGGCAGGA AGAAGCGGAG ACAGCGACGA AGACCTCCTC AAGGCAGTCA GACTCATCAA 1200
GTTTCTCTAT CAAAGCAACC CACCTCCCAA TCCCGAGGGG ACCCGACAGG CCCGAAGGAA 1260
ACTAGTGGCC ACCATCACCA TCACCATTAA 1290



(2) INFORMATION FOR SEQ ID NO:17: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 412 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

Cys Ser Ser His Ser Ser Asn Met Ala Asn Thr Gln Met Lys Ser Asp Lys Ile Ile Ile 20
Ala His Arg Gly Ala Ser Gly Tyr Leu Pro Glu His Thr Leu Glu Ser Lys Ala Leu Ala 40
Phe Ala Gln Gln Ala Asp Tyr Leu Glu Gln Asp Leu Ala Met Thr Lys Asp Gly Arg Leu 60
Val Val Ile His Asp His Phe Leu Asp Gly Leu Thr Asp Val Ala Lys Lys Phe Pro His 80
Arg His Arg Lys Asp Gly Arg Tyr Tyr Val Ile Asp Phe Thr Leu Lys Glu Ile Gln Ser 100
Leu Glu Met Thr Glu Asn Phe Glu Thr Met Gly Gly Lys Trp Ser Lys Ser Ser Val Val 120
Gly Trp Pro Thr Val Arg Glu Arg Met Arg Arg Ala Glu Pro Ala Ala Asp Gly Val Gly 140
Ala Ala Ser Arg Asp Leu Glu Lys His Gly Ala Ile Thr Ser Ser Asn Thr Ala Ala Thr 160
Asn Ala Ala Cys Ala Trp Leu Glu Ala Gln Glu Glu Glu Glu Val Gly Phe Pro Val Thr 180
Pro Gln Val Pro Leu Arg Pro Met Thr Tyr Lys Ala Ala Val Asp Leu Ser His Phe Leu 200
Lys Glu Lys Gly Gly Leu Glu Gly Leu Ile His Ser Gln Arg Arg Gln Asp Ile Leu Asp 220
Leu Trp Ile Tyr His Thr Gln Gly Tyr Phe Pro Asp Trp Gln Asn Tyr Thr Pro Gly Pro 240
Gly Val Arg Tyr Pro Leu Thr Phe Gly Trp Cys Tyr Lys Leu Val Pro Val Glu Pro Asp 260
Lys Val Glu Glu Ala Asn Lys Gly Glu Asn Thr Ser Leu Leu His Pro Val Ser Leu His 280
Gly Met Asp Asp Pro Glu Arg Glu Val Leu Glu Trp Arg Phe Asp Ser Arg Leu Ala Phe 300
His His Val Ala Arg Glu Leu His Pro Glu Tyr Phe Lys Asn Cys Thr Ser Glu Pro Val 320
Asp Pro Arg Leu Glu Pro Trp Lys His Pro Gly Ser Gln Pro Lys Thr Ala Cys Thr Asn 340
Cys Tyr Cys Lys Lys Cys Cys Phe His Cys Gln Val Cys Phe Ile Thr Lys Ala Leu Gly 360
Ile Ser Tyr Gly Arg Lys Lys Arg Arg Gln Arg Arg Arg Pro Pro Gln Gly Ser Gln Thr 380
His Gln Val Ser Leu Ser Lys Gln Pro Thr Ser Gln Ser Arg Gly Asp Pro Thr Gly Pro 400
Lys Glu Thr Ser Gly His His His His His His 411



(2) INFORMATION FOR SEQ ID NO:18: 



22 



EP 1 015 596 B1 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 981 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: 



ATGGATCCAA GCAGCCATTC ATCAAATATG GCGAATACCC AAATGAAATC AGACAAAATC
ATTATTGCTC ACCGTGGTGC TAGCGGTTAT TTACCAGAGC ATACGTTAGA ATCTAAAGCA 
CTTGCGTTTG CACAACAGGC TGATTATTTA GAGCAAGATT TAGCAATGAC TAAGGATGGT 
CGTTTAGTGG TTATTCACGA TCACTTTTTA GATGGCTTGA CTGATGTTGC GAAAAAATTC 
CCACATCGTC ATCGTAAAGA TGGCCGTTAC TATGTCATCG ACTTTACCTT AAAAGAAATT 
CAAAGTTTAG AAATGACAGA AAACTTTGAA ACCATGGGTG GCAAGTGGTC AAAAAGTAGT 
GTGGTTGGAT GGCCTACTGT AAGGGAAAGA ATGAGACGAG CTGAGCCAGC AGCAGATGGG 
GTGGGAGCAG CATCTCGAGA CCTGGAAAAA CATGGAGCAA TCACAAGTAG CAATACAGCA 
GCTACCAATG CTGCTTGTGC CTGGCTAGAA GCACAAGAGG AGGAGGAGGT GGGTTTTCCA 
GTCACACCTC AGGTACCTTT AAGACCAATG ACTTACAAGG CAGCTGTAGA TCTTAGCCAC
TTTTTAAAAG AAAAGGGGGG ACTGGAAGGG CTAATTCACT CCCAACGAAG ACAAGATATC
CTTGATCTGT GGATCTACCA CACACAAGGC TACTTCCCTG ATTGGCAGAA CTACACACCA 
GGGCCAGGGG TCAGATATCC ACTGACCTTT GGATGGTGCT ACAAGCTAGT ACCAGTTGAG
CCAGATAAGG TAGAAGAGGC CAATAAAGGA GAGAACACCA GCTTGTTACA CCCTGTGAGC
CTGCATGGAA TGGATGACCC TGAGAGAGAA GTGTTAGAGT GGAGGTTTGA CAGCCGCCTA 
GCATTTCATC ACGTGGCCCG AGAGCTGCAT CCGGAGTACT TCAAGAACTG CACTAGTGGC 
CACCATCACC ATCACCATTA A



(2) INFORMATION FOR SEQ ID NO:19: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 327 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: 



Met Asp Pro Ser Ser His Ser Ser Asn Met Ala Asn Thr Gln Met Lys
1 5 10 15

Ser Asp Lys Ile Ile Ile Ala His Arg Gly Ala Ser Gly Tyr Leu Pro
20 25 30

Glu His Thr Leu Glu Ser Lys Ala Leu Ala Phe Ala Gln Gln Ala Asp
35 40 45

Tyr Leu Glu Gln Asp Leu Ala Met Thr Lys Asp Gly Arg Leu Val Val
50 55 60

Ile His Asp His Phe Leu Asp Gly Leu Thr Asp Val Ala Lys Lys Phe
65 70 75 80

Pro His Arg His Arg Lys Asp Gly Arg Tyr Tyr Val Ile Asp Phe Thr
85 90 95

Leu Lys Glu Ile Gln Ser Leu Glu Met Thr Glu Asn Phe Glu Thr Met
100 105 110

Gly Gly Lys Trp Ser Lys Ser Ser Val Val Gly Trp Pro Thr Val Arg
115 120 125

Glu Arg Met Arg Arg Ala Glu Pro Ala Ala Asp Gly Val Gly Ala Ala
130 135 140

Ser Arg Asp Leu Glu Lys His Gly Ala Ile Thr Ser Ser Asn Thr Ala
145 150 155 160

Ala Thr Asn Ala Ala Cys Ala Trp Leu Glu Ala Gln Glu Glu Glu Glu
165 170 175

Val Gly Phe Pro Val Thr Pro Gln Val Pro Leu Arg Pro Met Thr Tyr
180 185 190

Lys Ala Ala Val Asp Leu Ser His Phe Leu Lys Glu Lys Gly Gly Leu
195 200 205

Glu Gly Leu Ile His Ser Gln Arg Arg Gln Asp Ile Leu Asp Leu Trp
210 215 220

Ile Tyr His Thr Gln Gly Tyr Phe Pro Asp Trp Gln Asn Tyr Thr Pro
225 230 235 240

Gly Pro Gly Val Arg Tyr Pro Leu Thr Phe Gly Trp Cys Tyr Lys Leu
245 250 255

Val Pro Val Glu Pro Asp Lys Val Glu Glu Ala Asn Lys Gly Glu Asn
260 265 270

Thr Ser Leu Leu His Pro Val Ser Leu His Gly Met Asp Asp Pro Glu
275 280 285

Arg Glu Val Leu Glu Trp Arg Phe Asp Ser Arg Leu Ala Phe His His
290 295 300

Val Ala Arg Glu Leu His Pro Glu Tyr Phe Lys Asn Cys Thr Ser Gly
305 310 315 320

His His His His His His
325

(2) INFORMATION FOR SEQ ID NO:20: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1242 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 
ATGGATCCAA GCAGCCATTC ATCAAATATG GCGAATACCC AAATGAAATC AGACAAAATC
ATTATTGCTC ACCGTGGTGC TAGCGGTTAT TTACCAGAGC ATACGTTAGA ATCTAAAGCA
CTTGCGTTTG CACAACAGGC TGATTATTTA GAGCAAGATT TAGCAATGAC TAAGGATGGT
CGTTTAGTGG TTATTCACGA TCACTTTTTA GATGGCTTGA CTGATGTTGC GAAAAAATTC
CCACATCGTC ATCGTAAAGA TGGCCGTTAC TATGTCATCG ACTTTACCTT AAAAGAAATT
CAAAGTTTAG AAATGACAGA AAACTTTGAA ACCATGGGTG GCAAGTGGTC AAAAAGTAGT
GTGGTTGGAT GGCCTACTGT AAGGGAAAGA ATGAGACGAG CTGAGCCAGC AGCAGATGGG
GTGGGAGCAG CATCTCGAGA CCTGGAAAAA CATGGAGCAA TCACAAGTAG CAATACAGCA
GCTACCAATG CTGCTTGTGC CTGGCTAGAA GCACAAGAGG AGGAGGAGGT GGGTTTTCCA
GTCACACCTC AGGTACCTTT AAGACCAATG ACTTACAAGG CAGCTGTAGA TCTTAGCCAC
TTTTTAAAAG AAAAGGGGGG ACTGGAAGGG CTAATTCACT CCCAACGAAG ACAAGATATC
CTTGATCTGT GGATCTACCA CACACAAGGC TACTTCCCTG ATTGGCAGAA CTACACACCA
GGGCCAGGGG TCAGATATCC ACTGACCTTT GGATGGTGCT ACAAGCTAGT ACCAGTTGAG
CCAGATAAGG TAGAAGAGGC CAATAAAGGA GAGAACACCA GCTTGTTACA CCCTGTGAGC
CTGCATGGAA TGGATGACCC TGAGAGAGAA GTGTTAGAGT GGAGGTTTGA CAGCCGCCTA
GCATTTCATC ACGTGGCCCG AGAGCTGCAT CCGGAGTACT TCAAGAACTG CACTAGTGAG
CCAGTAGATC CTAGACTAGA GCCCTGGAAG CATCCAGGAA GTCAGCCTAA AACTGCTTGT
ACCAATTGCT ATTGTAAAAA GTGTTGCTTT CATTGCCAAG TTTGTTTCAT AACAAAAGCC
TTAGGCATCT CCTATGGCAG GAAGAAGCGG AGACAGCGAC GAAGACCTCC TCAAGGCAGT
CAGACTCATC AAGTTTCTCT ATCAAAGCAA CCCACCTCCC AATCCCGAGG GGACCCGACA
GGCCCGAAGG AAACTAGTGG CCACCATCAC CATCACCATT AA



(2) INFORMATION FOR SEQ ID NO:21:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 414 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: 
Met Asp Pro Ser Ser His Ser Ser Asn Met Ala Asn Thr Gln Met Lys
1 5 10 15

Ser Asp Lys Ile Ile Ile Ala His Arg Gly Ala Ser Gly Tyr Leu Pro
20 25 30

Glu His Thr Leu Glu Ser Lys Ala Leu Ala Phe Ala Gln Gln Ala Asp
35 40 45

Tyr Leu Glu Gln Asp Leu Ala Met Thr Lys Asp Gly Arg Leu Val Val
50 55 60

Ile His Asp His Phe Leu Asp Gly Leu Thr Asp Val Ala Lys Lys Phe
65 70 75 80

Pro His Arg His Arg Lys Asp Gly Arg Tyr Tyr Val Ile Asp Phe Thr
85 90 95

Leu Lys Glu Ile Gln Ser Leu Glu Met Thr Glu Asn Phe Glu Thr Met
100 105 110

Gly Gly Lys Trp Ser Lys Ser Ser Val Val Gly Trp Pro Thr Val Arg
115 120 125

Glu Arg Met Arg Arg Ala Glu Pro Ala Ala Asp Gly Val Gly Ala Ala
130 135 140

Ser Arg Asp Leu Glu Lys His Gly Ala Ile Thr Ser Ser Asn Thr Ala
145 150 155 160

Ala Thr Asn Ala Ala Cys Ala Trp Leu Glu Ala Gln Glu Glu Glu Glu
165 170 175

Val Gly Phe Pro Val Thr Pro Gln Val Pro Leu Arg Pro Met Thr Tyr
180 185 190

Lys Ala Ala Val Asp Leu Ser His Phe Leu Lys Glu Lys Gly Gly Leu
195 200 205

Glu Gly Leu Ile His Ser Gln Arg Arg Gln Asp Ile Leu Asp Leu Trp
210 215 220

Ile Tyr His Thr Gln Gly Tyr Phe Pro Asp Trp Gln Asn Tyr Thr Pro
225 230 235 240

Gly Pro Gly Val Arg Tyr Pro Leu Thr Phe Gly Trp Cys Tyr Lys Leu
245 250 255

Val Pro Val Glu Pro Asp Lys Val Glu Glu Ala Asn Lys Gly Glu Asn
260 265 270

Thr Ser Leu Leu His Pro Val Ser Leu His Gly Met Asp Asp Pro Glu
275 280 285

Arg Glu Val Leu Glu Trp Arg Phe Asp Ser Arg Leu Ala Phe His His
290 295 300

Val Ala Arg Glu Leu His Pro Glu Tyr Phe Lys Asn Cys Thr Ser Glu
305 310 315 320

Pro Val Asp Pro Arg Leu Glu Pro Trp Lys His Pro Gly Ser Gln Pro
325 330 335

Lys Thr Ala Cys Thr Asn Cys Tyr Cys Lys Lys Cys Cys Phe His Cys
340 345 350

Gln Val Cys Phe Ile Thr Lys Ala Leu Gly Ile Ser Tyr Gly Arg Lys
355 360 365

Lys Arg Arg Gln Arg Arg Arg Pro Pro Gln Gly Ser Gln Thr His Gln
370 375 380

Val Ser Leu Ser Lys Gln Pro Thr Ser Gln Ser Arg Gly Asp Pro Thr
385 390 395 400

Gly Pro Lys Glu Thr Ser Gly His His His His His His
405 410



(2) INFORMATION FOR SEQ ID NO:22: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 288 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 



ATGGAGCCAG TAGATCCTAG ACTAGAGCCC TGGAAGCATC CAGGAAGTCA GCCTAAAACT 60

GCTTGTACCA ATTGCTATTG TAAAAAGTGT TGCTTTCATT GCCAAGTTTG TTTCATAACA 120 

GCTGCCTTAG GCATCTCCTA TGGCAGGAAG AAGCGGAGAC AGCGACGAAG ACCTCCTCAA 180 

GGCAGTCAGA CTCATCAAGT TTCTCTATCA AAGCAACCCA CCTCCCAATC CAAAGGGGAG 240 

CCGACAGGCC CGAAGGAAAC TAGTGGCCAC CATCACCATC ACCATTAA 288 



(2) INFORMATION FOR SEQ ID NO:23: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 



Met Glu Pro Val Asp Pro Arg Leu Glu Pro Trp Lys His Pro Gly Ser
1 5 10 15

Gln Pro Lys Thr Ala Cys Thr Asn Cys Tyr Cys Lys Lys Cys Cys Phe
20 25 30

His Cys Gln Val Cys Phe Ile Thr Ala Ala Leu Gly Ile Ser Tyr Gly
35 40 45

Arg Lys Lys Arg Arg Gln Arg Arg Arg Pro Pro Gln Gly Ser Gln Thr
50 55 60

His Gln Val Ser Leu Ser Lys Gln Pro Thr Ser Gln Ser Lys Gly Glu
65 70 75 80

Pro Thr Gly Pro Lys Glu Thr Ser Gly His His His His His His
85 90 95



(2) INFORMATION FOR SEQ ID NO:24: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 909 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 
ATGGGTGGCA AGTGGTCAAA AAGTAGTGTG GTTGGATGGC CTACTGTAAG GGAAAGAATG 60

AGACGAGCTG AGCCAGCAGC AGATGGGGTG GGAGCAGCAT CTCGAGACCT GGAAAAACAT 120 

GGAGCAATCA CAAGTAGCAA TACAGCAGCT ACCAATGCTG CTTGTGCCTG GCTAGAAGCA 180 

CAAGAGGAGG AGGAGGTGGG TTTTCCAGTC ACACCTCAGG TACCTTTAAG ACCAATGACT 240 

TACAAGGCAG CTGTAGATCT TAGCCACTTT TTAAAAGAAA AGGGGGGACT GGAAGGGCTA 300 

ATTCACTCCC AACGAAGACA AGATATCCTT GATCTGTGGA TCTACCACAC ACAAGGCTAC 360 

TTCCCTGATT GGCAGAACTA CACACCAGGG CCAGGGGTCA GATATCCACT GACCTTTGGA 420

TGGTGCTACA AGCTAGTACC AGTTGAGCCA GATAAGGTAG AAGAGGCCAA TAAAGGAGAG 480 

AACACCAGCT TGTTACACCC TGTGAGCCTG CATGGAATGG ATGACCCTGA GAGAGAAGTG 540

TTAGAGTGGA GGTTTGACAG CCGCCTAGCA TTTCATCACG TGGCCCGAGA GCTGCATCCG 600 

GAGTACTTCA AGAACTGCAC TAGTGAGCCA GTAGATCCTA GACTAGAGCC CTGGAAGCAT 660 

CCAGGAAGTC AGCCTAAAAC TGCTTGTACC AATTGCTATT GTAAAAAGTG TTGCTTTCAT 720 

TGCCAAGTTT GTTTCATAAC AGCTGCCTTA GGCATCTCCT ATGGCAGGAA GAAGCGGAGA 780 

CAGCGACGAA GACCTCCTCA AGGCAGTCAG ACTCATCAAG TTTCTCTATC AAAGCAACCC 840 

ACCTCCCAAT CCAAAGGGGA GCCGACAGGC CCGAAGGAAA CTAGTGGCCA CCATCACCAT 900 

CACCATTAA 909



(2) INFORMATION FOR SEQ ID NO:25: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 303 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 



Met Gly Gly Lys Trp Ser Lys Ser Ser Val Val Gly Trp Pro Thr Val
1 5 10 15

Arg Glu Arg Met Arg Arg Ala Glu Pro Ala Ala Asp Gly Val Gly Ala
20 25 30

Ala Ser Arg Asp Leu Glu Lys His Gly Ala Ile Thr Ser Ser Asn Thr
35 40 45

Ala Ala Thr Asn Ala Ala Cys Ala Trp Leu Glu Ala Gln Glu Glu Glu
50 55 60

Glu Val Gly Phe Pro Val Thr Pro Gln Val Pro Leu Arg Pro Met Thr
65 70 75 80

Tyr Lys Ala Ala Val Asp Leu Ser His Phe Leu Lys Glu Lys Gly Gly
85 90 95

Leu Glu Gly Leu Ile His Ser Gln Arg Arg Gln Asp Ile Leu Asp Leu
100 105 110

Trp Ile Tyr His Thr Gln Gly Tyr Phe Pro Asp Trp Gln Asn Tyr Thr
115 120 125

Pro Gly Pro Gly Val Arg Tyr Pro Leu Thr Phe Gly Trp Cys Tyr Lys
130 135 140

Leu Val Pro Val Glu Pro Asp Lys Val Glu Glu Ala Asn Lys Gly Glu
145 150 155 160

Asn Thr Ser Leu Leu His Pro Val Ser Leu His Gly Met Asp Asp Pro
165 170 175

Glu Arg Glu Val Leu Glu Trp Arg Phe Asp Ser Arg Leu Ala Phe His
180 185 190

His Val Ala Arg Glu Leu His Pro Glu Tyr Phe Lys Asn Cys Thr Ser
195 200 205

Glu Pro Val Asp Pro Arg Leu Glu Pro Trp Lys His Pro Gly Ser Gln
210 215 220

Pro Lys Thr Ala Cys Thr Asn Cys Tyr Cys Lys Lys Cys Cys Phe His
225 230 235 240

Cys Gln Val Cys Phe Ile Thr Ala Ala Leu Gly Ile Ser Tyr Gly Arg
245 250 255

Lys Lys Arg Arg Gln Arg Arg Arg Pro Pro Gln Gly Ser Gln Thr His
260 265 270

Gln Val Ser Leu Ser Lys Gln Pro Thr Ser Gln Ser Lys Gly Glu Pro
275 280 285

Thr Gly Pro Lys Glu Thr Ser Gly His His His His His His
290 295 300







(2) INFORMATION FOR SEQ ID NO:26: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 

TTCGAAACCA TGGCCGCGGA CTAGTGGCCA CCATCACCAT CACCATTAAC GGAATTC 57 
(2) INFORMATION FOR SEQ ID NO:27: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

Thr Ser Gly His His His His His His 
1 5 
Claims

1. A vaccine composition which comprises a protein comprising

(a) an entire HIV Tat protein or Tat with a C-terminal histidine tail, or a mutated Tat which has undergone deletion,
addition or substitution of one amino acid, or a mutated Tat as defined by SEQ ID NO. 23, linked to either (i)
a protein or lipoprotein fusion partner or (ii) an entire HIV Nef protein or Nef with a C-terminal histidine tail, or
Nef which has undergone deletion, addition or substitution of one amino acid; or

(b) an entire HIV Nef protein or Nef with a C-terminal histidine tail, or Nef which has undergone deletion, addition
or substitution of one amino acid, linked to either (i) a protein or lipoprotein fusion partner or (ii) an entire HIV
Tat protein or Tat with a C-terminal histidine tail, or a mutated Tat which has undergone deletion, addition or
substitution of one amino acid, or a mutated Tat as defined by SEQ ID NO. 23; or

(c) an entire HIV Nef protein or Nef with a C-terminal histidine tail, or Nef which has undergone deletion, addition
or substitution of one amino acid, linked to an entire HIV Tat protein or Tat with a C-terminal histidine tail, or a
mutated Tat which has undergone deletion, addition or substitution of one amino acid, or a mutated Tat as
defined by SEQ ID NO. 23, and a protein or lipoprotein fusion partner,

in admixture with a pharmaceutically acceptable excipient.

2. A composition as claimed in claim 1 comprising a Tat-Nef fusion protein or derivative thereof.

3. A composition as claimed in claim 1 comprising a Nef-Tat fusion protein or derivative thereof.

4. A composition as claimed in any one of claims 1 to 3 wherein the lipoprotein is Haemophilus Influenza B protein D
or derivative thereof.

5. A composition as claimed in claim 4 wherein the fusion partner comprises between 100 and 130 amino acids from the
N terminal of Haemophilus Influenza B protein D.

6. A composition as claimed in any one of claims 1 to 5, wherein the Tat protein is fused to an HIV Nef protein and a
fusion partner.

7. A composition as claimed in any one of claims 1 to 6, wherein the protein has a histidine tail.

8. A composition as claimed in any one of claims 1 to 7 wherein the protein is a Nef-Tat fusion protein or derivative
thereof and is carboxymethylated.

9. A composition as claimed in any one of claims 1 to 8, additionally comprising an adjuvant.

10. A composition as claimed in claim 9, wherein the adjuvant is a TH1 inducing adjuvant.

11. A composition as claimed in claim 9 or 10 wherein the adjuvant comprises monophosphoryl lipid A or a derivative
thereof such as 3 de-O-acylated monophosphoryl lipid A.

12. A composition as claimed in any one of claims 9 to 11 additionally comprising a saponin adjuvant.

13. A composition as claimed in claim 11 or claim 12 which additionally comprises an oil in water emulsion and tocopherol.

14. A composition as claimed in any one of claims 9 to 13 wherein the adjuvant comprises 3D-MPL, QS21 and
an oil in water emulsion of tocopherol, squalene and Tween 80™.

15. A composition as claimed in any one of claims 1 to 14 further comprising HIV gp160 or its derivative gp120.

16. A protein comprising an entire HIV Tat protein or Tat with a C-terminal histidine tail, or a mutated Tat which has
undergone deletion, addition or substitution of one amino acid, or a mutated Tat as defined by SEQ ID NO. 23,
linked to an entire HIV Nef protein or Nef with a C-terminal histidine tail, or Nef which has undergone deletion,
addition or substitution of one amino acid, in Nef-Tat or Tat-Nef orientation.

17. A nucleic acid encoding a protein of claim 16.

18. A host transformed with a nucleic acid of claim 17.

19. A host as claimed in claim 18 wherein the host is either E. coli or Pichia pastoris.

20. A method of producing a protein of claim 16, comprising providing a host as claimed in claim 18 or 19, expressing
said protein and recovering the protein.

21. A method of preparing a protein comprising (a) an entire HIV Tat protein or Tat with a C-terminal histidine tail, or a
mutated Tat which has undergone deletion, addition or substitution of one amino acid, or a mutated Tat as defined
by SEQ ID NO. 23, linked to either (i) a protein or lipoprotein fusion partner or (ii) an entire HIV Nef
protein or Nef with a C-terminal histidine tail, or Nef which has undergone deletion, addition or substitution of one
amino acid; or (b) an entire HIV Nef protein or Nef with a C-terminal histidine tail, or Nef which has undergone
deletion, addition or substitution of one amino acid, linked to either (i) a protein or lipoprotein fusion partner or (ii)
an entire HIV Tat protein or Tat with a C-terminal histidine tail, or a mutated Tat which has undergone deletion,
addition or substitution of one amino acid, or a mutated Tat as defined by SEQ ID NO. 23; or (c) an entire HIV Nef
protein or Nef with a C-terminal histidine tail, or Nef which has undergone deletion, addition or substitution of one
amino acid, linked to an entire HIV Tat protein or Tat with a C-terminal histidine tail, or a mutated Tat which has
undergone deletion, addition or substitution of one amino acid, or a mutated Tat as defined by SEQ ID NO. 23, and
a protein or lipoprotein fusion partner, in Pichia pastoris, which method comprises the steps of transforming Pichia
pastoris with DNA encoding said protein, expressing said protein and recovering the protein.

22. The method of claim 21 wherein the protein is a Nef-Tat fusion protein or derivative thereof and the method further
comprises a carboxymethylation step performed on the expressed protein.

23. A method of producing a vaccine, comprising admixing the protein from any one of claims 20 to 22 with a
pharmaceutically acceptable diluent.

24. The method of claim 23 further comprising the addition of HIV gp160 or its derivative gp120.

25. The method of claims 20 to 24 further comprising the addition of an adjuvant, particularly a TH1 inducing adjuvant.

26. A Nef-Tat-His or a Nef-Tat Mutant-His protein or polynucleotide having the amino acid or DNA sequence shown in
SEQ ID NOs. 12, 13, 16, 17, 20, 21, 24 or 25.



Patent claims

1. A vaccine composition comprising:

(a) an entire HIV Tat protein or Tat with a C-terminal histidine tail, or a mutated Tat which has undergone
deletion, addition or substitution of one amino acid, or a mutated Tat as defined in SEQ ID NO: 23, linked to
either (i) a protein or lipoprotein fusion partner or (ii) the entire HIV Nef protein or Nef with a C-terminal
histidine tail, or Nef which has undergone deletion, addition or substitution of one amino acid; or

(b) an entire HIV Nef protein or Nef with a C-terminal histidine tail, or Nef which has undergone deletion,
addition or substitution of one amino acid, linked to either (i) a protein or lipoprotein fusion partner or
(ii) the entire HIV Tat protein or Tat with a C-terminal histidine tail, or a mutated Tat which has undergone
deletion, addition or substitution of one amino acid, or a mutated Tat as defined in SEQ ID NO: 23; or

(c) an entire HIV Nef protein or Nef with a C-terminal histidine tail, or Nef which has undergone deletion,
addition or substitution of one amino acid, linked to the entire HIV Tat protein or Tat with a C-terminal
histidine tail, or a mutated Tat which has undergone deletion, addition or substitution of one amino acid, or a
mutated Tat as defined in SEQ ID NO: 23, and a protein or lipoprotein fusion partner,

in admixture with a pharmaceutically acceptable excipient.
2. A composition according to claim 1, comprising a Tat-Nef fusion protein or derivative thereof.

3. A composition according to claim 1, comprising a Nef-Tat fusion protein or a derivative thereof.

4. A composition according to any one of claims 1 to 3, wherein the lipoprotein is Haemophilus Influenza
B protein D or a derivative thereof.

5. A composition according to claim 4, wherein the fusion partner comprises between 100 and 130 amino acids from
the N-terminus of Haemophilus Influenza B protein D.

6. A composition according to any one of claims 1 to 5, wherein the Tat protein is fused to an HIV Nef protein
and a fusion partner.

7. A composition according to any one of claims 1 to 6, wherein the protein has a histidine tail.

8. A composition according to any one of claims 1 to 7, wherein the protein is a Nef-Tat fusion protein or
a derivative thereof and is carboxymethylated.

9. A composition according to any one of claims 1 to 8, additionally comprising an adjuvant.

10. A composition according to claim 9, wherein the adjuvant is a TH1-inducing adjuvant.

11. A composition according to claim 9 or 10, wherein the adjuvant comprises monophosphoryl lipid A or a derivative
thereof, such as 3-de-O-acylated monophosphoryl lipid A.

12. A composition according to any one of claims 9 to 11, additionally comprising a saponin adjuvant.

13. A composition according to claim 11 or 12, which additionally comprises an oil-in-water emulsion and tocopherol.

14. A composition according to any one of claims 9 to 13, wherein the adjuvant comprises 3D-MPL, QS21 and an
oil-in-water emulsion of tocopherol, squalene and Tween 80™.

15. A composition according to any one of claims 1 to 14, further comprising HIV gp160 or its
derivative gp120.

16. A protein comprising the entire HIV Tat protein or Tat with a C-terminal histidine tail, or a mutated
Tat which has undergone deletion, addition or substitution of one amino acid, or a mutated Tat as defined in SEQ
ID NO: 23, linked to the entire HIV Nef protein or Nef with a C-terminal histidine tail, or Nef which has
undergone deletion, addition or substitution of one amino acid, in Nef-Tat or Tat-Nef orientation.

17. A nucleic acid encoding the protein according to claim 16.

18. A host transformed with a nucleic acid according to claim 17.

19. A host according to claim 18, wherein the host is either E. coli or Pichia pastoris.

20. A method of producing a protein according to claim 16, comprising providing a host according to claim
18 or 19, expressing the protein and recovering the protein.

21. A method of producing a protein comprising (a) an entire HIV Tat protein or Tat with a C-terminal
histidine tail, or a mutated Tat which has undergone deletion, addition or substitution of one amino acid,
or a mutated Tat as defined in SEQ ID NO: 23, linked to either (i) a protein or lipoprotein fusion partner
or (ii) the entire HIV Nef protein or Nef with a C-terminal histidine tail, or Nef which has undergone
deletion, addition or substitution of one amino acid; or (b) an entire HIV Nef protein or Nef with a
C-terminal histidine tail, or Nef which has undergone deletion, addition or substitution of one amino acid,
linked to either (i) a protein or lipoprotein fusion partner or (ii) an entire HIV Tat protein or Tat with a
C-terminal histidine tail, or a mutated Tat which has undergone deletion, addition or substitution of one
amino acid, or a mutated Tat as defined in SEQ ID NO: 23; or (c) an entire HIV Nef protein or Nef with a
C-terminal histidine tail, or Nef which has undergone deletion, addition or substitution of one amino acid,
linked to an entire HIV Tat protein or Tat with a C-terminal histidine tail, or a mutated Tat which has
undergone deletion, addition or substitution of one amino acid, or a mutated Tat as defined in SEQ ID NO: 23,
and a protein or lipoprotein fusion partner; in Pichia pastoris, wherein the method comprises the steps of
transforming Pichia pastoris with DNA encoding the protein, expressing the protein and recovering
the protein.

22. A method according to claim 21, wherein the protein is a Nef-Tat fusion protein or a derivative thereof, the
method further comprising a carboxymethylation step performed on the expressed protein.

23. A method of producing a vaccine, comprising mixing the protein according to any one of claims
20 to 22 with a pharmaceutically acceptable diluent.

24. A method according to claim 23, further comprising the addition of HIV gp160 or its derivative gp120.

25. A method according to claims 20 to 24, further comprising the addition of an adjuvant, in particular a
TH1-inducing adjuvant.

26. A Nef-Tat-His or Nef-Tat mutant-His protein or polynucleotide having the amino acid or DNA sequence shown in
SEQ ID NOs: 12, 13, 16, 17, 20, 21, 24 or 25.

Claims

1. A vaccine composition which comprises a protein comprising:

(a) an entire HIV Tat protein or a Tat protein with a C-terminal histidine tail, or a mutated Tat protein
which has undergone deletion, addition or substitution of one amino acid, or a mutated Tat protein as
defined by SEQ ID NO: 23, linked to either (i) a protein or lipoprotein fusion partner or
(ii) an entire HIV Nef protein or a Nef protein with a C-terminal histidine tail, or a Nef protein
which has undergone deletion, addition or substitution of one amino acid; or

(b) an entire HIV Nef protein or a Nef protein with a C-terminal histidine tail, or a Nef protein
which has undergone deletion, addition or substitution of one amino acid, linked to either (i) a protein or
lipoprotein fusion partner or (ii) an entire HIV Tat protein or a Tat protein with a C-terminal histidine
tail, or a mutated Tat protein which has undergone deletion, addition or substitution of one amino acid, or
a mutated Tat protein as defined by SEQ ID NO: 23; or

(c) an entire HIV Nef protein or a Nef protein with a C-terminal histidine tail, or a Nef protein
which has undergone deletion, addition or substitution of one amino acid, linked to an entire HIV Tat protein
or a Tat protein with a C-terminal histidine tail, or a mutated Tat protein which has undergone deletion,
addition or substitution of one amino acid, or a mutated Tat protein as defined by SEQ ID NO: 23,
and a protein or lipoprotein fusion partner,

in admixture with a pharmaceutically acceptable excipient.

2. A composition as defined in claim 1 comprising a Tat-Nef fusion protein or a derivative
thereof.

3. A composition as defined in claim 1, comprising a Nef-Tat fusion protein or a derivative
thereof.

4. A composition as defined in any one of claims 1 to 3, wherein the lipoprotein is
Haemophilus Influenza B protein D or a derivative thereof.

5. A composition as defined in claim 4, wherein the fusion partner comprises between 100 and
130 amino acids from the N-terminal of Haemophilus Influenza B protein D.
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6. Composition telle que definie dans Tune quelconque des revendications 1 a 5, dans laquelle la proteine Tat est 
fusionnee a une proteine Nef du VIH et a un partenaire de fusion. 

7. A composition as defined in any one of claims 1 to 6, wherein the protein has a histidine
tail.

8. A composition as defined in any one of claims 1 to 7, wherein the protein is a Nef-Tat
fusion protein or a derivative thereof and is carboxymethylated.

9. A composition as defined in any one of claims 1 to 8, further comprising an adjuvant.

10. A composition as defined in claim 9, wherein the adjuvant is a TH1-inducing adjuvant.

11. A composition as defined in claim 9 or 10, wherein the adjuvant comprises monophosphoryl
lipid A or a derivative thereof such as 3-de-O-acylated monophosphoryl lipid A.

12. A composition as defined in any one of claims 9 to 11, further comprising a saponin
adjuvant.

13. A composition as defined in claim 11 or claim 12, which further comprises an oil-in-water
emulsion and tocopherol.

14. A composition as defined in any one of claims 9 to 13, wherein the adjuvant comprises
3D-MPL, QS21 and an oil-in-water emulsion of tocopherol, squalene and Tween 80™.

15. A composition as defined in any one of claims 1 to 14, further comprising HIV gp160
or its derivative gp120.

16. A protein comprising an entire HIV Tat protein or a Tat protein with a C-terminal histidine
tail, or a mutated Tat protein which has undergone deletion, addition or substitution of one amino
acid, or a mutated Tat protein as defined by SEQ ID NO: 23, linked to an entire HIV Nef protein or
a Nef protein with a C-terminal histidine tail, or a Nef protein which has undergone deletion,
addition or substitution of one amino acid, in a Nef-Tat or Tat-Nef orientation.

17. A nucleic acid encoding a protein according to claim 16.

18. A host transformed with a nucleic acid according to claim 17.

19. A host as defined in claim 18, wherein the host is either E. coli or Pichia pastoris.

20. A process for the production of a protein according to claim 16, comprising providing a host
as defined in claim 18 or 19, expressing said protein and recovering the protein.

21. A process for the preparation of a protein comprising (a) an entire HIV Tat protein or a Tat
protein with a C-terminal histidine tail, or a mutated Tat protein which has undergone deletion,
addition or substitution of one amino acid, or a mutated Tat protein as defined by SEQ ID NO: 23,
linked to either (i) a protein or lipoprotein fusion partner or (ii) an entire HIV Nef protein or a
Nef protein with a C-terminal histidine tail, or a Nef protein which has undergone deletion,
addition or substitution of one amino acid; or (b) an entire HIV Nef protein or a Nef protein with
a C-terminal histidine tail, or a Nef protein which has undergone deletion, addition or
substitution of one amino acid, linked to either (i) a protein or lipoprotein fusion partner or
(ii) an entire HIV Tat protein or a Tat protein with a C-terminal histidine tail, or a mutated Tat
protein which has undergone deletion, addition or substitution of one amino acid, or a mutated Tat
protein as defined by SEQ ID NO: 23; or (c) an entire HIV Nef protein or a Nef protein with a
C-terminal histidine tail, or a Nef protein which has undergone deletion, addition or substitution
of one amino acid, linked to an entire HIV Tat protein or a Tat protein with a C-terminal histidine
tail, or a mutated Tat protein which has undergone deletion, addition or substitution of one amino
acid, or a mutated Tat protein as defined by SEQ ID NO: 23, and a protein or lipoprotein fusion
partner, in Pichia pastoris, which process comprises the steps of transforming Pichia pastoris with
a DNA encoding said protein, expressing said protein and recovering the protein.

22. A process according to claim 21, wherein the protein is a Nef-Tat fusion protein or a
derivative thereof and the process further comprises a carboxymethylation step performed on the
expressed protein.

23. A process for the production of a vaccine, comprising admixing the protein according to any
one of claims 20 to 22 with a pharmaceutically acceptable diluent.

24. A process according to claim 23, further comprising the addition of HIV gp160 or its
derivative gp120.

25. A process according to claims 20 to 24, further comprising the addition of an adjuvant, in
particular a TH1-inducing adjuvant.

26. A Nef-Tat-His or Nef-Tat-Mutant-His protein or polynucleotide having the amino acid or DNA
sequence shown in SEQ ID NO: 12, 13, 16, 17, 20, 21, 24 or 25.

Figure 1: A/ Map of plasmid pRIT14586




B/ Coding sequence of the first 127 amino acids of protein D and multiple cloning site. The signal sequence is underlined.

BamHI
ATG GAT CCA AAA ACT TTA GCC CTT TCT TTA TTA GCA GCT GGC GTA CTA GCA GGT TGT AGC AGC
Met Asp Pro Lys Thr Leu Ala Leu Ser Leu Leu Ala Ala Gly Val Leu Ala Gly Cys Ser Ser

CAT TCA TCA AAT ATG GCG AAT ACC CAA ATG AAA TCA GAC AAA ATC ATT ATT GCT CAC CGT GGT
His Ser Ser Asn Met Ala Asn Thr Gln Met Lys Ser Asp Lys Ile Ile Ile Ala His Arg Gly

GCT AGC GGT TAT TTA CCA GAG CAT ACG TTA GAA TCT AAA GCA CTT GCT TTT GCA CAA CAG GCT
Ala Ser Gly Tyr Leu Pro Glu His Thr Leu Glu Ser Lys Ala Leu Ala Phe Ala Gln Gln Ala

GAT TAT TTA GAG CAA GAT TTA GCA ATG ACT AAG GAT GGT CGT TTA GTG GTT ATT CAC GAT CAC
Asp Tyr Leu Glu Gln Asp Leu Ala Met Thr Lys Asp Gly Arg Leu Val Val Ile His Asp His

TTT TTA GAT GGC TTG ACT GAT GTT GCG AAA AAA TTC CCA CAT CGT CAT CGT AAA GAT GGC CGT
Phe Leu Asp Gly Leu Thr Asp Val Ala Lys Lys Phe Pro His Arg His Arg Lys Asp Gly Arg

TAC TAT GTC ATC GAC TTT ACC TTA AAA GAA ATT CAA AGT TTA GAA ATG ACA GAA AAC TTT GAA
Tyr Tyr Val Ile Asp Phe Thr Leu Lys Glu Ile Gln Ser Leu Glu Met Thr Glu Asn Phe Glu

NcoI SpeI XbaI
ACC ATG GCC ACG TGT GAT CAG AGC TCA ACT AGT GGA CAC CAT CAC CAT CAC CAT TAA TCT AGA
Thr Met Ala Thr Cys Asp Gln Ser Ser Thr Ser Gly His His His His His His *



The amino acid sequence of Figure 1 relates to Seq. ID. No. 7 and the nucleic acid sequence of
Figure 1 relates to Seq. ID. No. 6.

The DNA and amino acid sequences of Nef-His, Tat-His, the Nef-Tat-His fusion and
mutated Tat are illustrated.

Pichia-expressed constructs (plain constructs) 
=> Nef-HIS

DNA sequence (Seq. ID. No. 8) 

ATGGGTGGCAAGTGGTCAAAAAGTAGTGTGGTTGGATGGCCTACTGTAAGGGAAAGA 
ATGAGACGAGCTGAGCCAGCAGCAGATGGGGTGGGAGCAGCATCTCGAGACCTGGAA 
AAACATGGAGCAATCACAAGTAGCAATACAGCAGCTACCAATGCTGCTTGTGCCTGG 
CTAGAAGCACAAGAGGAGGAGGAGGTGGGTTTTCCAGTCACACCTCAGGTACCTTTA 
AGACCAATGACTTACAAGGCAGCTGTAGATCTTAGCCACTTTTTAAAAGAAAAGGGG 
GGACTGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATC 
TACCACACACAAGGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGGTC 
AGATATCCACTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGCCAGATAAG 
GTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACACCCTGTGAGCCTGCAT 
GGAATGGATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCA 
TTTCATCACGTGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCACTAGTGGC 
CACCATCACCATCACCATTAA 

Protein sequence (Seq. ID. No. 9)

MGGKWSKSSVVGWPTVRERMRRAEPAADGVGAASRDLEKHGAITSSNTAATNAACAW
LEAQEEEEVGFPVTPQVPLRPMTYKAAVDLSHFLKEKGGLEGLIHSQRRQDILDLWI
YHTQGYFPDWQNYTPGPGVRYPLTFGWCYKLVPVEPDKVEEANKGENTSLLHPVSLH
GMDDPEREVLEWRFDSRLAFHHVARELHPEYFKNCTSGHHHHHH.



=> Tat-HIS 

DNA sequence (Seq. ID. No. 10)



ATGGAGCCAGTAGATCCTAGACTAGAGCCCTGGAAGCATCCAGGAAGTCAGCCTAAA 
ACTGCTTGTACCAATTGCTATTGTAAAAAGTGTTGCTTTCATTGCCAAGTTTGTTTC 
ATAACAAAAGCCTTAGGCATCTCCTATGGCAGGAAGAAGCGGAGACAGCGACGAAGA 
CCTCCTCAAGGCAGTCAGACTCATCAAGTTTCTCTATCAAAGCAACCCACCTCCCAA
TCCCGAGGGGACCCGACAGGCCCGAAGGAAACTAGTGGCCACCATCACCATCACCAT
TAA 

Protein sequence (Seq. ID. No. 11) 

MEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFITKALGISYGRKKRRQRRR 
PPQGSQTHQVSLSKQPTSQSRGDPTGPKETSGHHHHHH. 
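
The relationship between the DNA and protein listings can be checked with a short script. The following is a minimal Python sketch, not part of the original disclosure: it translates the Tat-HIS coding sequence (Seq. ID. No. 10) with the standard genetic code and prints the resulting protein for comparison with Seq. ID. No. 11. The names `CODON_TABLE`, `translate` and `TAT_HIS_DNA` are illustrative assumptions.

```python
from itertools import product

# Standard genetic code with codons enumerated in T, C, A, G order
# (the conventional layout of NCBI translation table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence from its first base to the first stop codon."""
    dna = "".join(dna.split()).upper()  # remove whitespace from line wrapping
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the open reading frame
            break
        protein.append(amino_acid)
    return "".join(protein)


# Tat-HIS coding sequence as listed above (Seq. ID. No. 10).
TAT_HIS_DNA = (
    "ATGGAGCCAGTAGATCCTAGACTAGAGCCCTGGAAGCATCCAGGAAGTCAGCCTAAA"
    "ACTGCTTGTACCAATTGCTATTGTAAAAAGTGTTGCTTTCATTGCCAAGTTTGTTTC"
    "ATAACAAAAGCCTTAGGCATCTCCTATGGCAGGAAGAAGCGGAGACAGCGACGAAGA"
    "CCTCCTCAAGGCAGTCAGACTCATCAAGTTTCTCTATCAAAGCAACCCACCTCCCAA"
    "TCCCGAGGGGACCCGACAGGCCCGAAGGAAACTAGTGGCCACCATCACCATCACCAT"
    "TAA"
)

tat_his_protein = translate(TAT_HIS_DNA)
print(tat_his_protein)
print(len(tat_his_protein))  # 95 residues, including the six-histidine tail
```

The same function can be applied to the other coding sequences in these figures once line-wrapping spaces are removed; the trailing TAA codon corresponds to the "." that closes each protein listing.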



=> Nef-Tat-HIS 

DNA sequence (Seq. ID. No. 12)

ATGGGTGGCAAGTGGTCAAAAAGTAGTGTGGTTGGATGGCCTACTGTAAGGGAAAGA
ATGAGACGAGCTGAGCCAGCAGCAGATGGGGTGGGAGCAGCATCTCGAGACCTGGAA
AAACATGGAGCAATCACAAGTAGCAATACAGCAGCTACCAATGCTGCTTGTGCCTGG
CTAGAAGCACAAGAGGAGGAGGAGGTGGGTTTTCCAGTCACACCTCAGGTACCTTTA
AGACCAATGACTTACAAGGCAGCTGTAGATCTTAGCCACTTTTTAAAAGAAAAGGGG
GGACTGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATC
TACCACACACAAGGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGGTC
AGATATCCACTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGCCAGATAAG
GTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACACCCTGTGAGCCTGCAT 
GGAATGGATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCA 
TTTCATCACGTGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCACTAGTGAG 
CCAGTAGATCCTAGACTAGAGCCCTGGAAGCATCCAGGAAGTCAGCCTAAAACTGCT 
TGTACCAATTGCTATTGTAAAAAGTGTTGCTTTCATTGCCAAGTTTGTTTCATAACA 
AAAGCCTTAGGCATCTCCTATGGCAGGAAGAAGCGGAGACAGCGACGAAGACCTCCT
CAAGGCAGTCAGACTCATCAAGTTTCTCTATCAAAGCAACCCACCTCCCAATCCCGA
GGGGACCCGACAGGCCCGAAGGAAACTAGTGGCCACCATCACCATCACCATTAA

Protein sequence (Seq. ID. No. 13)

MGGKWSKSSVVGWPTVRERMRRAEPAADGVGAASRDLEKHGAITSSNTAATNAACAW
LEAQEEEEVGFPVTPQVPLRPMTYKAAVDLSHFLKEKGGLEGLIHSQRRQDILDLWI
YHTQGYFPDWQNYTPGPGVRYPLTFGWCYKLVPVEPDKVEEANKGENTSLLHPVSLH
GMDDPEREVLEWRFDSRLAFHHVARELHPEYFKNCTSEPVDPRLEPWKHPGSQPKTA
CTNCYCKKCCFHCQVCFITKALGISYGRKKRRQRRRPPQGSQTHQVSLSKQPTSQSR
GDPTGPKETSGHHHHHH.



E. coli-expressed constructs (fusion constructs)
=> LipoD-Nef-HIS

DNA sequence (Seq. ID. No. 14)

Nucleotides corresponding to the Prot D Fusion Partner are in bold. 
The Lipidation Signal Sequence is underlined. After processing, the cysteine 
coded by the TGT codon, indicated with a star, becomes the amino terminal 
residue which is then modified by covalently bound fatty acids. 

* 

ATGGATCCAAAAACTTTAGCCCTTTCTTTATTAGCAGCTGGCGTACTAGCAGGT TGT
AGCAGCCATTCATCAAATATGGCGAATACCCAAATGAAATCAGACAAAATCATTATT 
GCTCACCGTGGTGCTAGCGGTTATTTACCAGAGCATACGTTAGAATCTAAAGCACTT 
GCTTTTGCACAACAGGCTGATTATTTAGAGCAAGATTTAGCAATGACTAAGGATGGT 
CGTTTAGTGGTTATTCACGATCACTTTTTAGATGGCTTGACTGATGTTGCGAAAAAA 
TTCCCACATCGTCATCGTAAAGATGGCCGTTACTATGTCATCGACTTTACCTTAAAA 
GAAATTCAAAGTTTAGAAATGACAGAAAACTTTGAAACCATGGGTGGCAAGTGGTCA 

AAAAGTAGTGTGGTTGGATGGCCTACTGTAAGGGAAAGAATGAGACGAGCTGAGCCA 
GCAGCAGATGGGGTGGGAGCAGCATCTCGAGACCTGGAAAAACATGGAGCAATCACA 
AGTAGCAATACAGCAGCTACCAATGCTGCTTGTGCCTGGCTAGAAGCACAAGAGGAG 
GAGGAGGTGGGTTTTCCAGTCACACCTCAGGTACCTTTAAGACCAATGACTTACAAG
GCAGCTGTAGATCTTAGCCACTTTTTAAAAGAAAAGGGGGGACTGGAAGGGCTAATT 
CACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGCTAC
TTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTT 
GGATGGTGCTACAAGCTAGTACCAGTTGAGCCAGATAAGGTAGAAGAGGCCAATAAA 
GGAGAGAACACCAGCTTGTTACACCCTGTGAGCCTGCATGGAATGGATGACCCTGAG 
AGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCACGTGGCCCGA
GAGCTGCATCCGGAGTACTTCAAGAACTGCACTAGTGGCCACCATCACCATCACCAT
TAA 

Protein sequence of the processed lipidated ProtD-Nef-HIS protein (Seq. ID. No. 15)
(Amino-acids corresponding to Prot D fusion partner are in bold)

CSSHSSNMANTQMKSDKIIIAHRGASGYLPEHTLESKALAFAQQADYLEQDLAMTKD
GRLVVIHDHFLDGLTDVAKKFPHRHRKDGRYYVIDFTLKEIQSLEMTENFETMGGKW
SKSSVVGWPTVRERMRRAEPAADGVGAASRDLEKHGAITSSNTAATNAACAWLEAQE
EEEVGFPVTPQVPLRPMTYKAAVDLSHFLKEKGGLEGLIHSQRRQDILDLWIYHTQG
YFPDWQNYTPGPGVRYPLTFGWCYKLVPVEPDKVEEANKGENTSLLHPVSLHGMDDP
EREVLEWRFDSRLAFHHVARELHPEYFKNCTSGHHHHHH.
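
The processing step described for the LipoD constructs can be illustrated with a short Python sketch. It translates the first 63 nucleotides of the LipoD coding sequence and splits the result at the first cysteine, which the text identifies as the amino-terminal residue after processing. The names `CODON_TABLE`, `translate` and `LIPOD_START_DNA` are hypothetical, and splitting at the first cysteine is a simplifying assumption that fits this construct; it is not a general signal-peptide predictor.

```python
from itertools import product

# Standard genetic code with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate every complete codon of a sequence that contains no stop codon."""
    dna = "".join(dna.split()).upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# First 63 nucleotides of the LipoD coding sequence (start of Seq. ID. No. 14).
LIPOD_START_DNA = "ATGGATCCAAAAACTTTAGCCCTTTCTTTATTAGCAGCTGGCGTACTAGCAGGTTGTAGCAGC"

precursor = translate(LIPOD_START_DNA)

# Assumption for this construct only: the first cysteine (TGT codon) is the
# residue that becomes the amino terminus after processing.
cleavage_index = precursor.index("C")
signal_peptide = precursor[:cleavage_index]
mature_start = precursor[cleavage_index:]

print(signal_peptide)  # MDPKTLALSLLAAGVLAG
print(mature_start)    # CSS
```

The printed `mature_start` matches the first three residues of the processed protein listing (Seq. ID. No. 15), which begins CSSHSS.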



=> LipoD-Nef-Tat-HIS 

DNA sequence (Seq. ID. No. 16)

Nucleotides corresponding to the Prot D Fusion Partner are in bold.
The Lipidation Signal Sequence is underlined. After processing, the cysteine 
coded by the TGT codon, indicated with a star, becomes the amino terminal 
residue which is then modified by covalently bound fatty acids. 

ATGGATCCAAAAACTTTAGCCCTTTCTTTATTAGCAGCTGGCGTACTAGCAGGT TGT 
AGCAGCCATTCATCAAATATGGCGAATACCCAAATGAAATCAGACAAAATCATTATT 
GCTCACCGTGGTGCTAGCGGTTATTTACCAGAGCATACGTTAGAATCTAAAGCACTT 
GCGTTTGCACAACAGGCTGATTATTTAGAGCAAGATTTAGCAATGACTAAGGATGGT
CGTTTAGTGGTTATTCACGATCACTTTTTAGATGGCTTGACTGATGTTGCGAAAAAA 
TTCCCACATCGTCATCGTAAAGATGGCCGTTACTATGTCATCGACTTTACCTTAAAA 
GAAATTCAAAGTTTAGAAATGACAGAAAACTTTGAAACCATGGGTGGCAAGTGGTCA 

AAAAGTAGTGTGGTTGGATGGCCTACTGTAAGGGAAAGAATGAGACGAGCTGAGCCA 
GCAGCAGATGGGGTGGGAGCAGCATCTCGAGACCTGGAAAAACATGGAGCAATCACA
AGTAGCAATACAGCAGCTACCAATGCTGCTTGTGCCTGGCTAGAAGCACAAGAGGAG 
GAGGAGGTGGGTTTTCCAGTCACACCTCAGGTACCTTTAAGACCAATGACTTACAAG 
GCAGCTGTAGATCTTAGCCACTTTTTAAAAGAAAAGGGGGGACTGGAAGGGCTAATT 
CACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGCTAC
TTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTT 
GGATGGTGCTACAAGCTAGTACCAGTTGAGCCAGATAAGGTAGAAGAGGCCAATAAA 
GGAGAGAACACCAGCTTGTTACACCCTGTGAGCCTGCATGGAATGGATGACCCTGAG
AGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCACGTGGCCCGA 
GAGCTGCATCCGGAGTACTTCAAGAACTGCACTAGTGAGCCAGTAGATCCTAGACTA
GAGCCCTGGAAGCATCCAGGAAGTCAGCCTAAAACTGCTTGTACCAATTGCTATTGT 
AAAAAGTGTTGCTTTCATTGCCAAGTTTGTTTCATAACAAAAGCCTTAGGCATCTCC 
TATGGCAGGAAGAAGCGGAGACAGCGACGAAGACCTCCTCAAGGCAGTCAGACTCAT 
CAAGTTTCTCTATCAAAGCAACCCACCTCCCAATCCCGAGGGGACCCGACAGGCCCG
AAGGAAACTAGTGGCCACCATCACCATCACCATTAA 

Protein sequence of the processed lipidated ProtD-Nef-Tat-HIS protein (Seq. ID. No. 17)

(Amino-acids corresponding to Prot D fusion partner are in bold)

CSSHSSNMANTQMKSDKIIIAHRGASGYLPEHTLESKALAFAQQADYLEQDLAMTKD 
GRLVVIHDHFLDGLTDVAKKFPHRHRKDGRYYVIDFTLKEIQSLEMTENFETMGGKW

SKSSVVGWPTVRERMRRAEPAADGVGAASRDLEKHGAITSSNTAATNAACAWLEAQE 
EEEVGFPVTPQVPLRPMTYKAAVDLSHFLKEKGGLEGLIHSQRRQDILDLWIYHTQG 
YFPDWQNYTPGPGVRYPLTFGWCYKLVPVEPDKVEEANKGENTSLLHPVSLHGMDDP 
EREVLEWRFDSRLAFHHVARELHPEYFKNCTSEPVDPRLEPWKHPGSQPKTACTNCY
CKKCCFHCQVCFITKALGISYGRKKRRQRRRPPQGSQTHQVSLSKQPTSQSRGDPTG
PKETSGHHHHHH.

=> ProtD-Nef-HIS

DNA sequence (Seq. ID. No. 18) 

Nucleotides corresponding to the Prot D Fusion Partner are in bold. 

ATGGATCCAAGCAGCCATTCATCAAATATGGCGAATACCCAAATGAAATCAGACAAA 
ATCATTATTGCTCACCGTGGTGCTAGCGGTTATTTACCAGAGCATACGTTAGAATCT 
AAAGCACTTGCGTTTGCACAACAGGCTGATTATTTAGAGCAAGATTTAGCAATGACT 
AAGGATGGTCGTTTAGTGGTTATTCACGATCACTTTTTAGATGGCTTGACTGATGTT 
GCGAAAAAATTCCCACATCGTCATCGTAAAGATGGCCGTTACTATGTCATCGACTTT 
ACCTTAAAAGAAATTCAAAGTTTAGAAATGACAGAAAACTTTGAAACCATGGGTGGC 

AAGTGGTCAAAAAGTAGTGTGGTTGGATGGCCTACTGTAAGGGAAAGAATGAGACGA 
GCTGAGCCAGCAGCAGATGGGGTGGGAGCAGCATCTCGAGACCTGGAAAAACATGGA
GCAATCACAAGTAGCAATACAGCAGCTACCAATGCTGCTTGTGCCTGGCTAGAAGCA 
CAAGAGGAGGAGGAGGTGGGTTTTCCAGTCACACCTCAGGTACCTTTAAGACCAATG 
ACTTACAAGGCAGCTGTAGATCTTAGCCACTTTTTAAAAGAAAAGGGGGGACTGGAA 
GGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACA 
CAAGGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGGTCAGATATCCA 
CTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGCCAGATAAGGTAGAAGAG
GCCAATAAAGGAGAGAACACCAGCTTGTTACACCCTGTGAGCCTGCATGGAATGGAT 
GACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCAC 
GTGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCACTAGTGGCCACCATCAC 
CATCACCATTAA 



Protein sequence (Seq. ID. No. 19) 

(Amino-acids corresponding to Prot D fusion partner are in bold) 

MDPSSHSSNMANTQMKSDKIIIAHRGASGYLPEHTLESKALAFAQQADYL
EQDLAMTKDGRLVVIHDHFLDGLTDVAKKFPHRHRKDGRYYVIDFTLK
EIQSLEMTENFETMGGKWSKSSVVGWPTVRERMRRAEPAADGVGAASRDL
EKHGAITSSNTAATNAACAWLEAQEEEEVGFPVTPQVPLRPMTYKAAVDLSH
FLKEKGGLEGLIHSQRRQDILDLWIYHTQGYFPDWQNYTPGPGVRYPLTFGW
CYKLVPVEPDKVEEANKGENTSLLHPVSLHGMDDPEREVLEWRFDSRLAFH
HVARELHPEYFKNCTSGHHHHHH.



=> ProtD-Nef-Tat-HIS

DNA sequence (Seq. ID. No. 20)

Nucleotides corresponding to the Prot D Fusion Partner are in bold.

ATGGATCCAAGCAGCCATTCATCAAATATGGCGAATACCCAAATGAAATCAGACAAA
ATCATTATTGCTCACCGTGGTGCTAGCGGTTATTTACCAGAGCATACGTTAGAATCT 
AAAGCACTTGCGTTTGCACAACAGGCTGATTATTTAGAGCAAGATTTAGCAATGACT 
AAGGATGGTCGTTTAGTGGTTATTCACGATCACTTTTTAGATGGCTTGACTGATGTT 
GCGAAAAAATTCCCACATCGTCATCGTAAAGATGGCCGTTACTATGTCATCGACTTT 
ACCTTAAAAGAAATTCAAAGTTTAGAAATGACAGAAAACTTTGAAACCATGGGTGGC 

AAGTGGTCAAAAAGTAGTGTGGTTGGATGGCCTACTGTAAGGGAAAGAATGAGACGA 
GCTGAGCCAGCAGCAGATGGGGTGGGAGCAGCATCTCGAGACCTGGAAAAACATGGA
GCAATCACAAGTAGCAATACAGCAGCTACCAATGCTGCTTGTGCCTGGCTAGAAGCA
CAAGAGGAGGAGGAGGTGGGTTTTCCAGTCACACCTCAGGTACCTTTAAGACCAATG
ACTTACAAGGCAGCTGTAGATCTTAGCCACTTTTTAAAAGAAAAGGGGGGACTGGAA 
GGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACA 
CAAGGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGGTCAGATATCCA 
CTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGCCAGATAAGGTAGAAGAG
GCCAATAAAGGAGAGAACACCAGCTTGTTACACCCTGTGAGCCTGCATGGAATGGAT
GACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCAC 
GTGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCACTAGTGAGCCAGTAGAT 
CCTAGACTAGAGCCCTGGAAGCATCCAGGAAGTCAGCCTAAAACTGCTTGTACCAAT 
TGCTATTGTAAAAAGTGTTGCTTTCATTGCCAAGTTTGTTTCATAACAAAAGCCTTA 
GGCATCTCCTATGGCAGGAAGAAGCGGAGACAGCGACGAAGACCTCCTCAAGGCAGT
CAGACTCATCAAGTTTCTCTATCAAAGCAACCCACCTCCCAATCCCGAGGGGACCCG
ACAGGCCCGAAGGAAACTAGTGGCCACCATCACCATCACCATTAA



Protein sequence (Seq. ID. No. 21) 

(Amino-acids corresponding to Prot D fusion partner are in bold) 

MDPSSHSSNMANTQMKSDKIIIAHRGASGYLPEHTLESKALAFAQQADYLEQDLAMT
KDGRLVVIHDHFLDGLTDVAKKFPHRHRKDGRYYVIDFTLKEIQSLEMTENFETMGG
KWSKSSVVGWPTVRERMRRAEPAADGVGAASRDLEKHGAITSSNTAATNAACAWLEA
QEEEEVGFPVTPQVPLRPMTYKAAVDLSHFLKEKGGLEGLIHSQRRQDILDLWIYHT 
QGYFPDWQNYTPGPGVRYPLTFGWCYKLVPVEPDKVEEANKGENTSLLHPVSLHGMD 
DPEREVLEWRFDSRLAFHHVARELHPEYFKNCTSEPVDPRLEPWKHPGSQPKTACTN 
CYCKKCCFHCQVCFITKALGISYGRKKRRQRRRPPQGSQTHQVSLSKQPTSQSRGDP 
TGPKETSGHHHHHH.



=> Tat-MUTANT-HIS 



DNA sequence (Seq. ID. No. 22)

ATGGAGCCAGTAGATCCTAGACTAGAGCCCTGGAAGCATC 40
CAGGAAGTCAGCCTAAAACTGCTTGTACCAATTGCTATTG 80
TAAAAAGTGTTGCTTTCATTGCCAAGTTTGTTTCATAACA 120
GCTGCCTTAGGCATCTCCTATGGCAGGAAGAAGCGGAGAC 160
AGCGACGAAGACCTCCTCAAGGCAGTCAGACTCATCAAGT 200
TTCTCTATCAAAGCAACCCACCTCCCAATCCAAAGGGGAG 240
CCGACAGGCCCGAAGGAAACTAGTGGCCACCATCACCATC 280
ACCATTAA 288



Protein sequence (Seq. ID. No. 23)

Mutated amino-acids in Tat sequences are in bold.

MEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFIT 40
AALGISYGRKKRRQRRRPPQGSQTHQVSLSKQPTSQSKGE 80
PTGPKETSGHHHHHH. 95
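
Because the bold marking of the mutated residues does not survive in plain text, the substitutions can be recovered by comparing the two protein listings position by position. The following minimal Python sketch does this for Tat-HIS (Seq. ID. No. 11) and Tat-MUTANT-HIS (Seq. ID. No. 23); the function name `substitutions` and the constant names are illustrative assumptions, and the comparison assumes both sequences have the same length, as is the case here.

```python
# Protein listings copied from Seq. ID. No. 11 and Seq. ID. No. 23,
# without the trailing stop marker.
TAT_HIS = (
    "MEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFIT"
    "KALGISYGRKKRRQRRRPPQGSQTHQVSLSKQPTSQSRGD"
    "PTGPKETSGHHHHHH"
)
TAT_MUTANT_HIS = (
    "MEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFIT"
    "AALGISYGRKKRRQRRRPPQGSQTHQVSLSKQPTSQSKGE"
    "PTGPKETSGHHHHHH"
)


def substitutions(reference: str, variant: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference residue, variant residue) for each mismatch."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length for a positional comparison")
    return [
        (position, ref, var)
        for position, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


for position, ref, var in substitutions(TAT_HIS, TAT_MUTANT_HIS):
    print(f"{ref}{position}{var}")
# K41A
# R78K
# D80E
```

The three differences are Lys41 to Ala and the change of the Arg-Gly-Asp motif at positions 78 to 80 into Lys-Gly-Glu.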



=> Nef-Tat-Mutant-HIS

DNA sequence (Seq. ID. No. 24)

ATGGGTGGCAAGTGGTCAAAAAGTAGTGTGGTTGGATGGC 40
CTACTGTAAGGGAAAGAATGAGACGAGCTGAGCCAGCAGC 80
AGATGGGGTGGGAGCAGCATCTCGAGACCTGGAAAAACAT 120
GGAGCAATCACAAGTAGCAATACAGCAGCTACCAATGCTG 160
CTTGTGCCTGGCTAGAAGCACAAGAGGAGGAGGAGGTGGG 200
TTTTCCAGTCACACCTCAGGTACCTTTAAGACCAATGACT 240
TACAAGGCAGCTGTAGATCTTAGCCACTTTTTAAAAGAAA 280
AGGGGGGACTGGAAGGGCTAATTCACTCCCAACGAAGACA 320
AGATATCCTTGATCTGTGGATCTACCACACACAAGGCTAC 360
TTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGGTCA 400
GATATCCACTGACCTTTGGATGGTGCTACAAGCTAGTACC 440
AGTTGAGCCAGATAAGGTAGAAGAGGCCAATAAAGGAGAG 480
AACACCAGCTTGTTACACCCTGTGAGCCTGCATGGAATGG 520
ATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAG 560
CCGCCTAGCATTTCATCACGTGGCCCGAGAGCTGCATCCG 600
GAGTACTTCAAGAACTGCACTAGTGAGCCAGTAGATCCTA 640
GACTAGAGCCCTGGAAGCATCCAGGAAGTCAGCCTAAAAC 680
TGCTTGTACCAATTGCTATTGTAAAAAGTGTTGCTTTCAT 720
TGCCAAGTTTGTTTCATAACAGCTGCCTTAGGCATCTCCT 760
ATGGCAGGAAGAAGCGGAGACAGCGACGAAGACCTCCTCA 800
AGGCAGTCAGACTCATCAAGTTTCTCTATCAAAGCAACCC 840
ACCTCCCAATCCAAAGGGGAGCCGACAGGCCCGAAGGAAA 880
CTAGTGGCCACCATCACCATCACCATTAA 909

Protein sequence (Seq. ID. No. 25)
Mutated amino-acids in Tat sequence are in bold.

MGGKWSKSSVVGWPTVRERMRRAEPAADGVGAASRDLEKH 40
GAITSSNTAATNAACAWLEAQEEEEVGFPVTPQVPLRPMT 80
YKAAVDLSHFLKEKGGLEGLIHSQRRQDILDLWIYHTQGY 120
FPDWQNYTPGPGVRYPLTFGWCYKLVPVEPDKVEEANKGE 160
NTSLLHPVSLHGMDDPEREVLEWRFDSRLAFHHVARELHP 200
EYFKNCTSEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFH 240
CQVCFITAALGISYGRKKRRQRRRPPQGSQTHQVSLSKQP 280
TSQSKGEPTGPKETSGHHHHHH. 302

Fig. 5 SDS-PAGE: Nef-Tat-his fusion protein





Coomassie blue G250

1: MW (175/83/62.5/47.5/32.5/25/16.5/6.5 kDa)
2: TNH/23 SP eluate (4 µg)
3: TNH/23 Superdex200 eluate (4 µg)
4: TNH/23 Purified bulk (4 µg)
5: TNH/22 Purified bulk (4 µg)
6: TNH/23 Purified bulk (4 µg) / non-reducing conditions
7: TNH/22 Purified bulk (4 µg) / non-reducing conditions

Fig. 8 Cell binding assay

Fig. 9 Inhibition of cell growth